[jira] [Updated] (HBASE-5869) Move SplitLogManager splitlog taskstate and AssignmentManager RegionTransitionData znode data to pb

2012-04-26 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5869:
-

Attachment: v4.txt

v4: I still have to convert the tests.  This is taking a while; we serialize to 
and deserialize from zk all over our codebase, so it takes a while to chase down 
all the cases.
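
For context, a minimal sketch of what a pb-backed znode payload could look like, 
assuming a generated message along the lines of ZooKeeperProtos.SplitLogTask with 
a state and a server-name field (the names here are illustrative, not necessarily 
what the patch defines):

{code}
// Illustrative only: assumes a protobuf-generated SplitLogTask message.
// Any generated message exposes newBuilder()/build()/toByteArray()/parseFrom().
byte[] toZnodeBytes(String serverName) {
  ZooKeeperProtos.SplitLogTask task = ZooKeeperProtos.SplitLogTask.newBuilder()
      .setState(ZooKeeperProtos.SplitLogTask.State.UNASSIGNED)
      .setServerName(serverName)
      .build();
  return task.toByteArray();                 // bytes written to the task znode
}

ZooKeeperProtos.SplitLogTask fromZnodeBytes(byte[] data)
    throws com.google.protobuf.InvalidProtocolBufferException {
  return ZooKeeperProtos.SplitLogTask.parseFrom(data);   // bytes read back from zk
}
{code}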

 Move SplitLogManager splitlog taskstate and AssignmentManager 
 RegionTransitionData znode data to pb 
 -

 Key: HBASE-5869
 URL: https://issues.apache.org/jira/browse/HBASE-5869
 Project: HBase
  Issue Type: Task
Reporter: stack
 Attachments: firstcut.txt, secondcut.txt, v4.txt




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HBASE-5882) Process RIT on master restart can try assigning the region if the region is found on a dead server instead of waiting for Timeout Monitor

2012-04-26 Thread ramkrishna.s.vasudevan (JIRA)
ramkrishna.s.vasudevan created HBASE-5882:
-

 Summary: Process RIT on master restart can try assigning the 
region if the region is found on a dead server instead of waiting for Timeout 
Monitor
 Key: HBASE-5882
 URL: https://issues.apache.org/jira/browse/HBASE-5882
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.92.1, 0.90.6
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.90.7, 0.96.0, 0.94.1


Currently, on master restart, when it does processRIT, any region found on a dead 
server avoids a new assignment so that the timeout monitor can take care of it.
This case is most prominent when the node is found in RS_ZK_REGION_OPENING state.
I think we can handle this by triggering a new assignment with a new plan.
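
Roughly, the idea could look like the following (a sketch with illustrative 
names, not the actual patch):

{code}
// Sketch: while processing regions in transition on master restart, a region
// whose znode points at a dead server gets a fresh plan and an immediate assign
// instead of waiting for the TimeoutMonitor. Helper/field names are illustrative.
if (deadServers.contains(regionLocation)
    && eventType == EventType.RS_ZK_REGION_OPENING) {
  RegionPlan plan = new RegionPlan(regionInfo, regionLocation, null); // null dest: pick a new server
  addPlan(regionInfo.getEncodedName(), plan);
  assign(regionInfo, true);   // force a new assignment right away
}
{code}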



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5864) Error while reading from hfile in 0.94

2012-04-26 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262400#comment-13262400
 ] 

Lars Hofhansl commented on HBASE-5864:
--

@Ram: You are right, it is a small change.
Just wondering whether we actually need the part that changes {{public 
DataInputStream nextBlockAsStream(BlockType blockType)}} to {{public HFileBlock 
nextBlockWithBlockType(BlockType blockType)}}.


 Error while reading from hfile in 0.94
 --

 Key: HBASE-5864
 URL: https://issues.apache.org/jira/browse/HBASE-5864
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.94.0
Reporter: Gopinathan A
Assignee: ramkrishna.s.vasudevan
Priority: Blocker
 Fix For: 0.94.0

 Attachments: HBASE-5864_1.patch, HBASE-5864_2.patch, 
 HBASE-5864_3.patch, HBASE-5864_test.patch


 Got the following stacktrace during region split.
 {noformat}
 2012-04-24 16:05:42,168 WARN org.apache.hadoop.hbase.regionserver.Store: 
 Failed getting store size for value
 java.io.IOException: Requested block is out of range: 2906737606134037404, 
 lastDataBlockOffset: 84764558
   at 
 org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:278)
   at 
 org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.midkey(HFileBlockIndex.java:285)
   at 
 org.apache.hadoop.hbase.io.hfile.HFileReaderV2.midkey(HFileReaderV2.java:402)
   at 
 org.apache.hadoop.hbase.regionserver.StoreFile$Reader.midkey(StoreFile.java:1638)
   at 
 org.apache.hadoop.hbase.regionserver.Store.getSplitPoint(Store.java:1943)
   at 
 org.apache.hadoop.hbase.regionserver.RegionSplitPolicy.getSplitPoint(RegionSplitPolicy.java:77)
   at 
 org.apache.hadoop.hbase.regionserver.HRegion.checkSplit(HRegion.java:4921)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.splitRegion(HRegionServer.java:2901)
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5864) Error while reading from hfile in 0.94

2012-04-26 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262401#comment-13262401
 ] 

Lars Hofhansl commented on HBASE-5864:
--

Ah OK. Never mind, you need the Block to the checksumBytes.

 Error while reading from hfile in 0.94
 --

 Key: HBASE-5864
 URL: https://issues.apache.org/jira/browse/HBASE-5864
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.94.0
Reporter: Gopinathan A
Assignee: ramkrishna.s.vasudevan
Priority: Blocker
 Fix For: 0.94.0

 Attachments: HBASE-5864_1.patch, HBASE-5864_2.patch, 
 HBASE-5864_3.patch, HBASE-5864_test.patch


 Got the following stacktrace during region split.
 {noformat}
 2012-04-24 16:05:42,168 WARN org.apache.hadoop.hbase.regionserver.Store: 
 Failed getting store size for value
 java.io.IOException: Requested block is out of range: 2906737606134037404, 
 lastDataBlockOffset: 84764558
   at 
 org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:278)
   at 
 org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.midkey(HFileBlockIndex.java:285)
   at 
 org.apache.hadoop.hbase.io.hfile.HFileReaderV2.midkey(HFileReaderV2.java:402)
   at 
 org.apache.hadoop.hbase.regionserver.StoreFile$Reader.midkey(StoreFile.java:1638)
   at 
 org.apache.hadoop.hbase.regionserver.Store.getSplitPoint(Store.java:1943)
   at 
 org.apache.hadoop.hbase.regionserver.RegionSplitPolicy.getSplitPoint(RegionSplitPolicy.java:77)
   at 
 org.apache.hadoop.hbase.regionserver.HRegion.checkSplit(HRegion.java:4921)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.splitRegion(HRegionServer.java:2901)
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (HBASE-5864) Error while reading from hfile in 0.94

2012-04-26 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262401#comment-13262401
 ] 

Lars Hofhansl edited comment on HBASE-5864 at 4/26/12 6:18 AM:
---

Ah OK. Never mind, you need the Block to get the checksumBytes.

  was (Author: lhofhansl):
Ah OK. Never mind, you need the Block to the checksumBytes.
  
 Error while reading from hfile in 0.94
 --

 Key: HBASE-5864
 URL: https://issues.apache.org/jira/browse/HBASE-5864
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.94.0
Reporter: Gopinathan A
Assignee: ramkrishna.s.vasudevan
Priority: Blocker
 Fix For: 0.94.0

 Attachments: HBASE-5864_1.patch, HBASE-5864_2.patch, 
 HBASE-5864_3.patch, HBASE-5864_test.patch


 Got the following stacktrace during region split.
 {noformat}
 2012-04-24 16:05:42,168 WARN org.apache.hadoop.hbase.regionserver.Store: 
 Failed getting store size for value
 java.io.IOException: Requested block is out of range: 2906737606134037404, 
 lastDataBlockOffset: 84764558
   at 
 org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:278)
   at 
 org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.midkey(HFileBlockIndex.java:285)
   at 
 org.apache.hadoop.hbase.io.hfile.HFileReaderV2.midkey(HFileReaderV2.java:402)
   at 
 org.apache.hadoop.hbase.regionserver.StoreFile$Reader.midkey(StoreFile.java:1638)
   at 
 org.apache.hadoop.hbase.regionserver.Store.getSplitPoint(Store.java:1943)
   at 
 org.apache.hadoop.hbase.regionserver.RegionSplitPolicy.getSplitPoint(RegionSplitPolicy.java:77)
   at 
 org.apache.hadoop.hbase.regionserver.HRegion.checkSplit(HRegion.java:4921)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.splitRegion(HRegionServer.java:2901)
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5864) Error while reading from hfile in 0.94

2012-04-26 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262403#comment-13262403
 ] 

Lars Hofhansl commented on HBASE-5864:
--

I think I get the change now. +1

 Error while reading from hfile in 0.94
 --

 Key: HBASE-5864
 URL: https://issues.apache.org/jira/browse/HBASE-5864
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.94.0
Reporter: Gopinathan A
Assignee: ramkrishna.s.vasudevan
Priority: Blocker
 Fix For: 0.94.0

 Attachments: HBASE-5864_1.patch, HBASE-5864_2.patch, 
 HBASE-5864_3.patch, HBASE-5864_test.patch


 Got the following stacktrace during region split.
 {noformat}
 2012-04-24 16:05:42,168 WARN org.apache.hadoop.hbase.regionserver.Store: 
 Failed getting store size for value
 java.io.IOException: Requested block is out of range: 2906737606134037404, 
 lastDataBlockOffset: 84764558
   at 
 org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:278)
   at 
 org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.midkey(HFileBlockIndex.java:285)
   at 
 org.apache.hadoop.hbase.io.hfile.HFileReaderV2.midkey(HFileReaderV2.java:402)
   at 
 org.apache.hadoop.hbase.regionserver.StoreFile$Reader.midkey(StoreFile.java:1638)
   at 
 org.apache.hadoop.hbase.regionserver.Store.getSplitPoint(Store.java:1943)
   at 
 org.apache.hadoop.hbase.regionserver.RegionSplitPolicy.getSplitPoint(RegionSplitPolicy.java:77)
   at 
 org.apache.hadoop.hbase.regionserver.HRegion.checkSplit(HRegion.java:4921)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.splitRegion(HRegionServer.java:2901)
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5161) Compaction algorithm should prioritize reference files

2012-04-26 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262407#comment-13262407
 ] 

ramkrishna.s.vasudevan commented on HBASE-5161:
---

@Stack and @J-D

We seem to have ended up with the same problem.  We had some 32 reference files 
created, out of which one was never selected in further compaction cycles.

We even stopped the writes for 2 to 3 hours, but the compaction still did not 
pick up the reference file. Will dig into this more.

 Compaction algorithm should prioritize reference files
 --

 Key: HBASE-5161
 URL: https://issues.apache.org/jira/browse/HBASE-5161
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Jean-Daniel Cryans
Priority: Critical
 Fix For: 0.92.1, 0.94.0


 I got myself into a state where my table was un-splittable as long as the 
 insert load was coming in. Emergency flushes because of the low memory 
 barrier don't check the number of store files so it never blocks, to a point 
 where I had in one case 45 store files and the compactions were almost never 
 done on the reference files (had 15 of them, went down by one in 20 minutes). 
 Since you can't split regions with reference files, that region couldn't 
 split and was doomed to just get more store files until the load stopped.
 Marking this as a minor issue; what we really need is a better pushback 
 mechanism, but not prioritizing reference files seems wrong.
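
One way to act on that, as a sketch only (illustrative use of 
StoreFile.isReference(); not a committed fix), is to push reference files to the 
front of the compaction selection:

{code}
// Sketch: order compaction candidates so that reference files (the half-files
// left behind by a split) are considered first, letting the region become
// splittable again sooner. Purely illustrative, not the real selection code.
List<StoreFile> candidates = new ArrayList<StoreFile>(storefiles);
Collections.sort(candidates, new Comparator<StoreFile>() {
  public int compare(StoreFile a, StoreFile b) {
    int refA = a.isReference() ? 0 : 1;   // references sort first
    int refB = b.isReference() ? 0 : 1;
    return refA - refB;
  }
});
{code}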

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (HBASE-5161) Compaction algorithm should prioritize reference files

2012-04-26 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262407#comment-13262407
 ] 

ramkrishna.s.vasudevan edited comment on HBASE-5161 at 4/26/12 6:36 AM:


@Stack and @J-D

We seem to have ended up with the same problem.  We had some 32 reference files 
created, out of which one was never selected in further compaction cycles.

We even stopped the writes for 2 to 3 hours, but the compaction still did not 
pick up the reference file. 

The region grew up to 400 GB.
{code}
./hbase-root-regionserver-HOST-192-168-47-205.log:2012-04-26 09:44:15,835 DEBUG 
org.apache.hadoop.hbase.regionserver.Store: 
hdfs://10.18.40.217:9000/hbase/ufdr/ce5c144a1714df08db1132238a749116/value/cde90029ecb74ef791500ccd3a1e8908.755d1cf6b960c02cc72c1dd83551df82-hdfs://10.18.40.217:9000/hbase/ufdr/755d1cf6b960c02cc72c1dd83551df82/value/cde90029ecb74ef791500ccd3a1e8908-top
 is not splittable
{code}
We kept getting the above log for almost 2 to 3 hours.  The pair of this reference 
file (its bottom half) was also not compacted.

Will dig in more to find any other reason why it is not getting picked up.

  was (Author: ram_krish):
@Stack and @J-D

We seem to end up in the same problem.  We had some 32 reference files created 
out of which one was never selected in further compaction cycles.

We even tried to stop the writes for 2 to 3 hours but still the compaction did 
not pick up the reference file. Will dig in more into this.
  
 Compaction algorithm should prioritize reference files
 --

 Key: HBASE-5161
 URL: https://issues.apache.org/jira/browse/HBASE-5161
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Jean-Daniel Cryans
Priority: Critical
 Fix For: 0.92.1, 0.94.0


 I got myself into a state where my table was un-splittable as long as the 
 insert load was coming in. Emergency flushes because of the low memory 
 barrier don't check the number of store files so it never blocks, to a point 
 where I had in one case 45 store files and the compactions were almost never 
 done on the reference files (had 15 of them, went down by one in 20 minutes). 
 Since you can't split regions with reference files, that region couldn't 
 split and was doomed to just get more store files until the load stopped.
 Marking this as a minor issue; what we really need is a better pushback 
 mechanism, but not prioritizing reference files seems wrong.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-2214) Do HBASE-1996 -- setting size to return in scan rather than count of rows -- properly

2012-04-26 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262456#comment-13262456
 ] 

jirapos...@reviews.apache.org commented on HBASE-2214:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4726/
---

(Updated 2012-04-26 08:18:40.576784)


Review request for hbase and Ted Yu.


Changes
---

Uploaded v5 of patch. The test still works.


Summary
---

HBASE-2214 per scan max buffersize.


This addresses bug HBASE-2214.
https://issues.apache.org/jira/browse/HBASE-2214


Diffs (updated)
-

  /src/main/java/org/apache/hadoop/hbase/client/ClientScanner.java 1330680 
  /src/main/java/org/apache/hadoop/hbase/client/Scan.java 1330680 
  /src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java 1330680 
  /src/main/java/org/apache/hadoop/hbase/protobuf/RequestConverter.java 1330680 
  /src/main/java/org/apache/hadoop/hbase/protobuf/generated/AdminProtos.java 
1330680 
  /src/main/java/org/apache/hadoop/hbase/protobuf/generated/ClientProtos.java 
1330680 
  /src/main/java/org/apache/hadoop/hbase/protobuf/generated/HBaseProtos.java 
1330680 
  /src/main/java/org/apache/hadoop/hbase/protobuf/generated/RPCProtos.java 
1330680 
  
/src/main/java/org/apache/hadoop/hbase/protobuf/generated/ZooKeeperProtos.java 
1330680 
  /src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1330680 
  /src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 
1330680 
  /src/main/java/org/apache/hadoop/hbase/regionserver/RegionScanner.java 
1330680 
  /src/main/java/org/apache/hadoop/hbase/regionserver/RegionServer.java 1330680 
  /src/main/protobuf/Client.proto 1330680 
  
/src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorInterface.java
 1330680 

Diff: https://reviews.apache.org/r/4726/diff


Testing
---

It works when running this test:


new HBaseTestingUtility(conf).startMiniCluster();

HBaseAdmin admin = new HBaseAdmin(conf);
if (!admin.tableExists("test")) {
  HTableDescriptor tableDesc = new HTableDescriptor("test");
  tableDesc.addFamily(new HColumnDescriptor("fam"));
  admin.createTable(tableDesc);
}

HTable table = new HTable(conf, "test");
Put put;

// three small rows in family "fam"
put = new Put(Bytes.toBytes("row1"));
put.add(Bytes.toBytes("fam"), Bytes.toBytes("qual1"), Bytes.toBytes("val1"));
table.put(put);

put = new Put(Bytes.toBytes("row2"));
put.add(Bytes.toBytes("fam"), Bytes.toBytes("qual2"), Bytes.toBytes("val2"));
table.put(put);

put = new Put(Bytes.toBytes("row3"));
put.add(Bytes.toBytes("fam"), Bytes.toBytes("qual3"), Bytes.toBytes("val3"));
table.put(put);

table.flushCommits();
{
  // caching=100 and no size limit: all three rows come back in one call
  System.out.println("returns all rows at once because of the caching");
  Scan scan = new Scan();
  scan.setCaching(100);
  ResultScanner scanner = table.getScanner(scan);
  scanner.next(100);
}
{
  // maxResultSize=1 byte forces a round trip per row despite caching=100
  System.out.println("returns one row at a time because of the maxResultSize");
  Scan scan = new Scan();
  scan.setCaching(100);
  scan.setMaxResultSize(1);
  ResultScanner scanner = table.getScanner(scan);
  scanner.next(100);
}


See output:

returns all rows at once because of the caching
2012-04-25 22:18:47,494 DEBUG [main] client.ClientScanner(94): Creating scanner 
over test starting at key ''
2012-04-25 22:18:47,494 DEBUG [main] client.ClientScanner(206): Advancing 
internal scanner to startKey at ''
2012-04-25 22:18:47,499 DEBUG [main] client.ClientScanner(323): Rows returned 3
2012-04-25 22:18:47,502 DEBUG [main] client.ClientScanner(193): Finished with 
scanning at {NAME => 'test,,1335385126388.ed23a82f3d6ca2eab571918843796259.', 
STARTKEY => '', ENDKEY => '', ENCODED => ed23a82f3d6ca2eab571918843796259,}
returns one row at a time because of the maxResultSize
2012-04-25 22:18:47,504 DEBUG [main] client.ClientScanner(94): Creating scanner 
over test starting at key ''
2012-04-25 22:18:47,505 DEBUG [main] client.ClientScanner(206): Advancing 
internal scanner to startKey at ''
2012-04-25 22:18:47,514 DEBUG [main] client.ClientScanner(323): Rows returned 1
2012-04-25 22:18:47,517 DEBUG [main] client.ClientScanner(323): Rows returned 1
2012-04-25 22:18:47,522 DEBUG [main] client.ClientScanner(323): Rows returned 1


Thanks,

ferdy



 Do HBASE-1996 -- setting size to return in scan rather than count of rows -- 
 properly
 -

 Key: HBASE-2214
 URL: https://issues.apache.org/jira/browse/HBASE-2214
 Project: HBase
  Issue Type: New Feature
Reporter: stack
Assignee: Ferdy Galema
 Attachments: 

[jira] [Updated] (HBASE-2214) Do HBASE-1996 -- setting size to return in scan rather than count of rows -- properly

2012-04-26 Thread Ferdy Galema (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-2214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdy Galema updated HBASE-2214:


Status: Open  (was: Patch Available)

 Do HBASE-1996 -- setting size to return in scan rather than count of rows -- 
 properly
 -

 Key: HBASE-2214
 URL: https://issues.apache.org/jira/browse/HBASE-2214
 Project: HBase
  Issue Type: New Feature
Reporter: stack
Assignee: Ferdy Galema
 Attachments: HBASE-2214-0.94.txt, HBASE-2214-v4.txt, 
 HBASE-2214-v5.txt, HBASE-2214_with_broken_TestShell.txt


 The notion that you set a size rather than a row count to specify how many rows a 
 scanner should return in each cycle was raised over in HBASE-1996.  It's a 
 good one, making HBase regular even though the data under it may vary.  
 HBASE-1996 was committed, but that patch was constrained by the fact that it 
 could not change the RPC interface.  This issue is about doing HBASE-1996 for 
 0.21 in a clean, unconstrained way.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-2214) Do HBASE-1996 -- setting size to return in scan rather than count of rows -- properly

2012-04-26 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262458#comment-13262458
 ] 

jirapos...@reviews.apache.org commented on HBASE-2214:
--



bq.  On 2012-04-25 21:02:15, Ted Yu wrote:
bq.   /src/main/java/org/apache/hadoop/hbase/client/ClientScanner.java, line 
323
bq.   https://reviews.apache.org/r/4726/diff/3/?file=104298#file104298line323
bq.  
bq.   This should be removed.

Ok done.


- ferdy


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4726/#review7242
---


On 2012-04-26 08:18:40, ferdy wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/4726/
bq.  ---
bq.  
bq.  (Updated 2012-04-26 08:18:40)
bq.  
bq.  
bq.  Review request for hbase and Ted Yu.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  HBASE-2214 per scan max buffersize.
bq.  
bq.  
bq.  This addresses bug HBASE-2214.
bq.  https://issues.apache.org/jira/browse/HBASE-2214
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq./src/main/java/org/apache/hadoop/hbase/client/ClientScanner.java 1330680 
bq./src/main/java/org/apache/hadoop/hbase/client/Scan.java 1330680 
bq./src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java 
1330680 
bq./src/main/java/org/apache/hadoop/hbase/protobuf/RequestConverter.java 
1330680 
bq.
/src/main/java/org/apache/hadoop/hbase/protobuf/generated/AdminProtos.java 
1330680 
bq.
/src/main/java/org/apache/hadoop/hbase/protobuf/generated/ClientProtos.java 
1330680 
bq.
/src/main/java/org/apache/hadoop/hbase/protobuf/generated/HBaseProtos.java 
1330680 
bq./src/main/java/org/apache/hadoop/hbase/protobuf/generated/RPCProtos.java 
1330680 
bq.
/src/main/java/org/apache/hadoop/hbase/protobuf/generated/ZooKeeperProtos.java 
1330680 
bq./src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1330680 
bq./src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 
1330680 
bq./src/main/java/org/apache/hadoop/hbase/regionserver/RegionScanner.java 
1330680 
bq./src/main/java/org/apache/hadoop/hbase/regionserver/RegionServer.java 
1330680 
bq./src/main/protobuf/Client.proto 1330680 
bq.
/src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorInterface.java
 1330680 
bq.  
bq.  Diff: https://reviews.apache.org/r/4726/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  It works when running this test:
bq.  
bq.  
bq.  new HBaseTestingUtility(conf).startMiniCluster();
bq.   
bq.  HBaseAdmin admin = new HBaseAdmin(conf);
bq.  if (!admin.tableExists(test)) {
bq.HTableDescriptor tableDesc = new HTableDescriptor(test);
bq.tableDesc.addFamily(new HColumnDescriptor(fam));
bq.admin.createTable(tableDesc);
bq.  }
bq.  
bq.  
bq.  HTable table = new HTable(conf, test);
bq.  Put put; 
bq.  
bq.  put = new Put(Bytes.toBytes(row1));
bq.  
put.add(Bytes.toBytes(fam),Bytes.toBytes(qual1),Bytes.toBytes(val1));
bq.  table.put(put);
bq.  
bq.  put = new Put(Bytes.toBytes(row2));
bq.  
put.add(Bytes.toBytes(fam),Bytes.toBytes(qual2),Bytes.toBytes(val2));
bq.  table.put(put);
bq.  
bq.  put = new Put(Bytes.toBytes(row3));
bq.  
put.add(Bytes.toBytes(fam),Bytes.toBytes(qual3),Bytes.toBytes(val3));
bq.  table.put(put);
bq.  
bq.  table.flushCommits();
bq.  {
bq.System.out.println(returns all rows at once because of the 
caching);
bq.Scan scan = new Scan();
bq.scan.setCaching(100);
bq.ResultScanner scanner = table.getScanner(scan);
bq.scanner.next(100);
bq.  }
bq.  {
bq.System.out.println(returns one row at a time because of the 
maxResultSize);
bq.Scan scan = new Scan();
bq.scan.setCaching(100);
bq.scan.setMaxResultSize(1);
bq.ResultScanner scanner = table.getScanner(scan);
bq.scanner.next(100);
bq.  }
bq.  
bq.  
bq.  See output:
bq.  
bq.  returns all rows at once because of the caching
bq.  2012-04-25 22:18:47,494 DEBUG [main] client.ClientScanner(94): Creating 
scanner over test starting at key ''
bq.  2012-04-25 22:18:47,494 DEBUG [main] client.ClientScanner(206): Advancing 
internal scanner to startKey at ''
bq.  2012-04-25 22:18:47,499 DEBUG [main] client.ClientScanner(323): Rows 
returned 3
bq.  2012-04-25 22:18:47,502 DEBUG [main] client.ClientScanner(193): Finished 
with scanning at {NAME = 
'test,,1335385126388.ed23a82f3d6ca2eab571918843796259.', STARTKEY = '', ENDKEY 
= '', ENCODED = ed23a82f3d6ca2eab571918843796259,}
bq.  returns one row at a time 

[jira] [Updated] (HBASE-2214) Do HBASE-1996 -- setting size to return in scan rather than count of rows -- properly

2012-04-26 Thread Ferdy Galema (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-2214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdy Galema updated HBASE-2214:


Status: Patch Available  (was: Open)

 Do HBASE-1996 -- setting size to return in scan rather than count of rows -- 
 properly
 -

 Key: HBASE-2214
 URL: https://issues.apache.org/jira/browse/HBASE-2214
 Project: HBase
  Issue Type: New Feature
Reporter: stack
Assignee: Ferdy Galema
 Attachments: HBASE-2214-0.94.txt, HBASE-2214-v4.txt, 
 HBASE-2214-v5.txt, HBASE-2214_with_broken_TestShell.txt


 The notion that you set a size rather than a row count to specify how many rows a 
 scanner should return in each cycle was raised over in HBASE-1996.  It's a 
 good one, making HBase regular even though the data under it may vary.  
 HBASE-1996 was committed, but that patch was constrained by the fact that it 
 could not change the RPC interface.  This issue is about doing HBASE-1996 for 
 0.21 in a clean, unconstrained way.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-2214) Do HBASE-1996 -- setting size to return in scan rather than count of rows -- properly

2012-04-26 Thread Ferdy Galema (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-2214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdy Galema updated HBASE-2214:


Attachment: HBASE-2214-v5.txt

 Do HBASE-1996 -- setting size to return in scan rather than count of rows -- 
 properly
 -

 Key: HBASE-2214
 URL: https://issues.apache.org/jira/browse/HBASE-2214
 Project: HBase
  Issue Type: New Feature
Reporter: stack
Assignee: Ferdy Galema
 Attachments: HBASE-2214-0.94.txt, HBASE-2214-v4.txt, 
 HBASE-2214-v5.txt, HBASE-2214_with_broken_TestShell.txt


 The notion that you set a size rather than a row count to specify how many rows a 
 scanner should return in each cycle was raised over in HBASE-1996.  It's a 
 good one, making HBase regular even though the data under it may vary.  
 HBASE-1996 was committed, but that patch was constrained by the fact that it 
 could not change the RPC interface.  This issue is about doing HBASE-1996 for 
 0.21 in a clean, unconstrained way.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-2214) Do HBASE-1996 -- setting size to return in scan rather than count of rows -- properly

2012-04-26 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262459#comment-13262459
 ] 

jirapos...@reviews.apache.org commented on HBASE-2214:
--



bq.  On 2012-04-26 03:10:35, Jimmy Xiang wrote:
bq.   /src/main/java/org/apache/hadoop/hbase/protobuf/RequestConverter.java, 
line 385
bq.   https://reviews.apache.org/r/4726/diff/3/?file=104301#file104301line385
bq.  
bq.   Can we set it only if maxResultSize is > 0?

Done. I've changed it everywhere else too, so that maxResultSize only works when 
> 0. (Setting it to 0 is bogus anyway; it might cause infinite loops.)
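
In other words, something along these lines (illustrative; the builder name is a 
stand-in for the protobuf Scan builder used in RequestConverter):

{code}
// Sketch of the guard being discussed: only send maxResultSize to the server
// when it is a positive value; 0 or unset means "no size limit".
if (scan.getMaxResultSize() > 0) {
  scanBuilder.setMaxResultSize(scan.getMaxResultSize());
}
{code}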


bq.  On 2012-04-26 03:10:35, Jimmy Xiang wrote:
bq.   /src/main/protobuf/Client.proto, line 197
bq.   https://reviews.apache.org/r/4726/diff/3/?file=104311#file104311line197
bq.  
bq.   Can we use uint64, without a default? So if it is not specified, we 
take it as -1.

Done (including regenerating the protobuf sources). I already noticed that 
int64 was not used anywhere else in the proto file. Just out of curiosity, are 
there good reasons to avoid int64?


- ferdy


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4726/#review7249
---


On 2012-04-26 08:18:40, ferdy wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/4726/
bq.  ---
bq.  
bq.  (Updated 2012-04-26 08:18:40)
bq.  
bq.  
bq.  Review request for hbase and Ted Yu.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  HBASE-2214 per scan max buffersize.
bq.  
bq.  
bq.  This addresses bug HBASE-2214.
bq.  https://issues.apache.org/jira/browse/HBASE-2214
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq./src/main/java/org/apache/hadoop/hbase/client/ClientScanner.java 1330680 
bq./src/main/java/org/apache/hadoop/hbase/client/Scan.java 1330680 
bq./src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java 
1330680 
bq./src/main/java/org/apache/hadoop/hbase/protobuf/RequestConverter.java 
1330680 
bq.
/src/main/java/org/apache/hadoop/hbase/protobuf/generated/AdminProtos.java 
1330680 
bq.
/src/main/java/org/apache/hadoop/hbase/protobuf/generated/ClientProtos.java 
1330680 
bq.
/src/main/java/org/apache/hadoop/hbase/protobuf/generated/HBaseProtos.java 
1330680 
bq./src/main/java/org/apache/hadoop/hbase/protobuf/generated/RPCProtos.java 
1330680 
bq.
/src/main/java/org/apache/hadoop/hbase/protobuf/generated/ZooKeeperProtos.java 
1330680 
bq./src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1330680 
bq./src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 
1330680 
bq./src/main/java/org/apache/hadoop/hbase/regionserver/RegionScanner.java 
1330680 
bq./src/main/java/org/apache/hadoop/hbase/regionserver/RegionServer.java 
1330680 
bq./src/main/protobuf/Client.proto 1330680 
bq.
/src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorInterface.java
 1330680 
bq.  
bq.  Diff: https://reviews.apache.org/r/4726/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  It works when running this test:
bq.  
bq.  
bq.  new HBaseTestingUtility(conf).startMiniCluster();
bq.   
bq.  HBaseAdmin admin = new HBaseAdmin(conf);
bq.  if (!admin.tableExists(test)) {
bq.HTableDescriptor tableDesc = new HTableDescriptor(test);
bq.tableDesc.addFamily(new HColumnDescriptor(fam));
bq.admin.createTable(tableDesc);
bq.  }
bq.  
bq.  
bq.  HTable table = new HTable(conf, test);
bq.  Put put; 
bq.  
bq.  put = new Put(Bytes.toBytes(row1));
bq.  
put.add(Bytes.toBytes(fam),Bytes.toBytes(qual1),Bytes.toBytes(val1));
bq.  table.put(put);
bq.  
bq.  put = new Put(Bytes.toBytes(row2));
bq.  
put.add(Bytes.toBytes(fam),Bytes.toBytes(qual2),Bytes.toBytes(val2));
bq.  table.put(put);
bq.  
bq.  put = new Put(Bytes.toBytes(row3));
bq.  
put.add(Bytes.toBytes(fam),Bytes.toBytes(qual3),Bytes.toBytes(val3));
bq.  table.put(put);
bq.  
bq.  table.flushCommits();
bq.  {
bq.System.out.println(returns all rows at once because of the 
caching);
bq.Scan scan = new Scan();
bq.scan.setCaching(100);
bq.ResultScanner scanner = table.getScanner(scan);
bq.scanner.next(100);
bq.  }
bq.  {
bq.System.out.println(returns one row at a time because of the 
maxResultSize);
bq.Scan scan = new Scan();
bq.scan.setCaching(100);
bq.scan.setMaxResultSize(1);
bq.ResultScanner scanner = table.getScanner(scan);
bq.scanner.next(100);
bq.  }
bq.  
bq.  
bq.  See output:
bq.  

[jira] [Commented] (HBASE-5877) When a query fails because the region has moved, let the regionserver return the new address to the client

2012-04-26 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262462#comment-13262462
 ] 

nkeywal commented on HBASE-5877:


bq. Can we mark the failure and make this RegionMovedException behave the same 
as NotServingRegionException ?
Done.

bq. For updateCachedLocations(), please put explanation for parameter on the 
same line as the parameter:
Done.

bq. 'Failed all' -> 'Failed call'
It's an existing comment that we can find again later in the code. It really 
means "failed all": all the queries on this server failed. I don't mind 
changing it to something better, but I think we should keep the "all".

bq. 'which the server' -> 'which the region'
Done.

bq. Please increase the VERSION of HRegionInterface
Done.

bq. How is the server removed from cache since I see 'continue' above ?
That's what makes this code complex and difficult to change: the error is 
actually managed later, when we don't have the real exception anymore.

bq. For ServerManager.sendRegionClose(), please add javadoc for destServerName 
param.
Done.

bq. Is it possible that destServerName is null ?
Safety checks added.

bq. Please change the above to debug log.  Why is the above fatal 
(regionResult != null) ? Step 4 appears in a comment below the above code. 
Should the above say step 3 ?
Bad logs fixed.

 When a query fails because the region has moved, let the regionserver return 
 the new address to the client
 --

 Key: HBASE-5877
 URL: https://issues.apache.org/jira/browse/HBASE-5877
 Project: HBase
  Issue Type: Improvement
  Components: client, master, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.96.0

 Attachments: 5877.v1.patch


 This is mainly useful when we do a rolling restart. This will decrease the 
 load on the master and the network load.
 Note that a region is not immediately opened after a close. So:
 - it seems preferable to wait before retrying on the other server. An 
 optimisation would be to have a heuristic depending on when the region was 
 closed.
 - during a rolling restart, the server moves the regions then stops. So we 
 may have failures when the server is stopped, and this patch won't help.
 The implementation in the first patch does:
 - on the region move, there is an added parameter on the regionserver#close 
 to say where we are sending the region
 - the regionserver keeps a list of what was moved. Each entry is kept 100 
 seconds.
 - the regionserver sends a specific exception when it receives a query on a 
 moved region. This exception contains the new address.
 - the client analyses the exceptions and updates its cache accordingly (see 
 the sketch below).
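
A rough sketch of that client-side handling (names are illustrative; the actual 
change is in 5877.v1.patch):

{code}
// Sketch: the regionserver remembers regions it moved for ~100 seconds and
// answers queries on them with an exception carrying the new address; the
// client refreshes its cached location from the exception and retries there,
// without going back to the master or .META. Names below are illustrative.
try {
  return server.get(regionName, get);
} catch (RegionMovedException rme) {
  // update the cached location with the destination embedded in the exception
  connection.updateCachedLocation(tableName, row, rme.getHostname(), rme.getPort());
  throw rme;  // the normal retry loop will use the refreshed cache
}
{code}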

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5611) Replayed edits from regions that failed to open during recovery aren't removed from the global MemStore size

2012-04-26 Thread Jieshan Bean (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jieshan Bean updated HBASE-5611:


Attachment: HBASE-5611-94.patch
HBASE-5611-trunk-v2.patch

 Replayed edits from regions that failed to open during recovery aren't 
 removed from the global MemStore size
 

 Key: HBASE-5611
 URL: https://issues.apache.org/jira/browse/HBASE-5611
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.6
Reporter: Jean-Daniel Cryans
Assignee: Jieshan Bean
Priority: Critical
 Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.1

 Attachments: HBASE-5611-94.patch, HBASE-5611-trunk-v2.patch, 
 HBASE-5611-trunk.patch


 This bug is rather easy to get if the {{TimeoutMonitor}} is on, else I think 
 it's still possible to hit it if a region fails to open for more obscure 
 reasons like HDFS errors.
 Consider a region that just went through distributed splitting and that's now 
 being opened by a new RS. The first thing it does is to read the recovery 
 files and put the edits in the {{MemStores}}. If this process takes a long 
 time, the master will move that region away. At that point the edits are 
 still accounted for in the global {{MemStore}} size but they are dropped when 
 the {{HRegion}} gets cleaned up. It's completely invisible until the 
 {{MemStoreFlusher}} needs to force flush a region and that none of them have 
 edits:
 {noformat}
 2012-03-21 00:33:39,303 DEBUG 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush thread woke up 
 because memory above low water=5.9g
 2012-03-21 00:33:39,303 ERROR 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Cache flusher failed 
 for entry null
 java.lang.IllegalStateException
 at 
 com.google.common.base.Preconditions.checkState(Preconditions.java:129)
 at 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushOneForGlobalPressure(MemStoreFlusher.java:199)
 at 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher.run(MemStoreFlusher.java:223)
 at java.lang.Thread.run(Thread.java:662)
 {noformat}
 The {{null}} here is a region. In my case I had so many edits in the 
 {{MemStore}} during recovery that I'm over the low barrier although in fact 
 I'm at 0. It happened yesterday and it's still printing this out.
 To fix this we need to be able to decrease the global {{MemStore}} size when 
 the region can't open (see the sketch below).
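
A minimal sketch of the fix direction (illustrative helper names, not the 
attached patch):

{code}
// Sketch: if the open is aborted after recovered edits were replayed into the
// MemStores, subtract the replayed bytes from the regionserver's global
// MemStore accounting so the flusher is not chasing memory that was dropped.
long replayedBytes = region.replayRecoveredEditsIfAny();   // illustrative helper
try {
  region.completeOpen();                                   // illustrative helper
} catch (IOException ioe) {
  rsAccounting.addAndGetGlobalMemstoreSize(-replayedBytes); // roll back accounting
  throw ioe;
}
{code}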

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-2214) Do HBASE-1996 -- setting size to return in scan rather than count of rows -- properly

2012-04-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262486#comment-13262486
 ] 

Hadoop QA commented on HBASE-2214:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12524407/HBASE-2214-v5.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 3 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.master.TestSplitLogManager

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1654//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1654//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1654//console

This message is automatically generated.

 Do HBASE-1996 -- setting size to return in scan rather than count of rows -- 
 properly
 -

 Key: HBASE-2214
 URL: https://issues.apache.org/jira/browse/HBASE-2214
 Project: HBase
  Issue Type: New Feature
Reporter: stack
Assignee: Ferdy Galema
 Attachments: HBASE-2214-0.94.txt, HBASE-2214-v4.txt, 
 HBASE-2214-v5.txt, HBASE-2214_with_broken_TestShell.txt


 The notion that you set a size rather than a row count to specify how many rows a 
 scanner should return in each cycle was raised over in HBASE-1996.  It's a 
 good one, making HBase regular even though the data under it may vary.  
 HBASE-1996 was committed, but that patch was constrained by the fact that it 
 could not change the RPC interface.  This issue is about doing HBASE-1996 for 
 0.21 in a clean, unconstrained way.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5611) Replayed edits from regions that failed to open during recovery aren't removed from the global MemStore size

2012-04-26 Thread Jieshan Bean (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jieshan Bean updated HBASE-5611:


Attachment: (was: HBASE-5611-94.patch)

 Replayed edits from regions that failed to open during recovery aren't 
 removed from the global MemStore size
 

 Key: HBASE-5611
 URL: https://issues.apache.org/jira/browse/HBASE-5611
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.6
Reporter: Jean-Daniel Cryans
Assignee: Jieshan Bean
Priority: Critical
 Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.1

 Attachments: HBASE-5611-trunk.patch


 This bug is rather easy to get if the {{TimeoutMonitor}} is on, else I think 
 it's still possible to hit it if a region fails to open for more obscure 
 reasons like HDFS errors.
 Consider a region that just went through distributed splitting and that's now 
 being opened by a new RS. The first thing it does is to read the recovery 
 files and put the edits in the {{MemStores}}. If this process takes a long 
 time, the master will move that region away. At that point the edits are 
 still accounted for in the global {{MemStore}} size but they are dropped when 
 the {{HRegion}} gets cleaned up. It's completely invisible until the 
 {{MemStoreFlusher}} needs to force flush a region and that none of them have 
 edits:
 {noformat}
 2012-03-21 00:33:39,303 DEBUG 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush thread woke up 
 because memory above low water=5.9g
 2012-03-21 00:33:39,303 ERROR 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Cache flusher failed 
 for entry null
 java.lang.IllegalStateException
 at 
 com.google.common.base.Preconditions.checkState(Preconditions.java:129)
 at 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushOneForGlobalPressure(MemStoreFlusher.java:199)
 at 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher.run(MemStoreFlusher.java:223)
 at java.lang.Thread.run(Thread.java:662)
 {noformat}
 The {{null}} here is a region. In my case I had so many edits in the 
 {{MemStore}} during recovery that I'm over the low barrier although in fact 
 I'm at 0. It happened yesterday and it's still printing this out.
 To fix this we need to be able to decrease the global {{MemStore}} size when 
 the region can't open.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5611) Replayed edits from regions that failed to open during recovery aren't removed from the global MemStore size

2012-04-26 Thread Jieshan Bean (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jieshan Bean updated HBASE-5611:


Attachment: (was: HBASE-5611-trunk-v2.patch)

 Replayed edits from regions that failed to open during recovery aren't 
 removed from the global MemStore size
 

 Key: HBASE-5611
 URL: https://issues.apache.org/jira/browse/HBASE-5611
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.6
Reporter: Jean-Daniel Cryans
Assignee: Jieshan Bean
Priority: Critical
 Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.1

 Attachments: HBASE-5611-trunk.patch


 This bug is rather easy to get if the {{TimeoutMonitor}} is on, else I think 
 it's still possible to hit it if a region fails to open for more obscure 
 reasons like HDFS errors.
 Consider a region that just went through distributed splitting and that's now 
 being opened by a new RS. The first thing it does is to read the recovery 
 files and put the edits in the {{MemStores}}. If this process takes a long 
 time, the master will move that region away. At that point the edits are 
 still accounted for in the global {{MemStore}} size but they are dropped when 
 the {{HRegion}} gets cleaned up. It's completely invisible until the 
 {{MemStoreFlusher}} needs to force flush a region and that none of them have 
 edits:
 {noformat}
 2012-03-21 00:33:39,303 DEBUG 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush thread woke up 
 because memory above low water=5.9g
 2012-03-21 00:33:39,303 ERROR 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Cache flusher failed 
 for entry null
 java.lang.IllegalStateException
 at 
 com.google.common.base.Preconditions.checkState(Preconditions.java:129)
 at 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushOneForGlobalPressure(MemStoreFlusher.java:199)
 at 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher.run(MemStoreFlusher.java:223)
 at java.lang.Thread.run(Thread.java:662)
 {noformat}
 The {{null}} here is a region. In my case I had so many edits in the 
 {{MemStore}} during recovery that I'm over the low barrier although in fact 
 I'm at 0. It happened yesterday and it's still printing this out.
 To fix this we need to be able to decrease the global {{MemStore}} size when 
 the region can't open.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5611) Replayed edits from regions that failed to open during recovery aren't removed from the global MemStore size

2012-04-26 Thread Jieshan Bean (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jieshan Bean updated HBASE-5611:


Attachment: HBASE-5611-trunk-v2.patch
HBASE-5611-94.patch

 Replayed edits from regions that failed to open during recovery aren't 
 removed from the global MemStore size
 

 Key: HBASE-5611
 URL: https://issues.apache.org/jira/browse/HBASE-5611
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.6
Reporter: Jean-Daniel Cryans
Assignee: Jieshan Bean
Priority: Critical
 Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.1

 Attachments: HBASE-5611-94.patch, HBASE-5611-trunk-v2.patch, 
 HBASE-5611-trunk.patch


 This bug is rather easy to get if the {{TimeoutMonitor}} is on, else I think 
 it's still possible to hit it if a region fails to open for more obscure 
 reasons like HDFS errors.
 Consider a region that just went through distributed splitting and that's now 
 being opened by a new RS. The first thing it does is to read the recovery 
 files and put the edits in the {{MemStores}}. If this process takes a long 
 time, the master will move that region away. At that point the edits are 
 still accounted for in the global {{MemStore}} size but they are dropped when 
 the {{HRegion}} gets cleaned up. It's completely invisible until the 
 {{MemStoreFlusher}} needs to force flush a region and that none of them have 
 edits:
 {noformat}
 2012-03-21 00:33:39,303 DEBUG 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush thread woke up 
 because memory above low water=5.9g
 2012-03-21 00:33:39,303 ERROR 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Cache flusher failed 
 for entry null
 java.lang.IllegalStateException
 at 
 com.google.common.base.Preconditions.checkState(Preconditions.java:129)
 at 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushOneForGlobalPressure(MemStoreFlusher.java:199)
 at 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher.run(MemStoreFlusher.java:223)
 at java.lang.Thread.run(Thread.java:662)
 {noformat}
 The {{null}} here is a region. In my case I had so many edits in the 
 {{MemStore}} during recovery that I'm over the low barrier although in fact 
 I'm at 0. It happened yesterday and it's still printing this out.
 To fix this we need to be able to decrease the global {{MemStore}} size when 
 the region can't open.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3529) Add search to HBase

2012-04-26 Thread Martin Alig (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262494#comment-13262494
 ] 

Martin Alig commented on HBASE-3529:


@Jason: Are you still working on this issue?

 Add search to HBase
 ---

 Key: HBASE-3529
 URL: https://issues.apache.org/jira/browse/HBASE-3529
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.90.0
Reporter: Jason Rutherglen
 Attachments: HBASE-3529.patch, HDFS-APPEND-0.20-LOCAL-FILE.patch


 Using the Apache Lucene library we can add freetext search to HBase.  The 
 advantages of this are:
 * HBase is highly scalable and distributed
 * HBase is realtime
 * Lucene is a fast inverted index and will soon be realtime (see LUCENE-2312)
 * Lucene offers many types of queries not currently available in HBase (eg, 
 AND, OR, NOT, phrase, etc)
 * It's easier to build scalable realtime systems on top of an already 
 architecturally sound, scalable realtime data system, e.g., HBase.
 * Scaling realtime search will be as simple as scaling HBase.
 Phase 1 - Indexing:
 * Integrate Lucene into HBase such that an index mirrors a given region.  
 This means cascading add, update, and deletes between a Lucene index and an 
 HBase region (and vice versa).
 * Define meta-data to mark a region as indexed, and use a Solr schema to 
 allow the user to define the fields and analyzers.
 * Integrate with the HLog to ensure that index recovery can occur properly 
 (eg, on region server failure)
 * Mirror region splits with indexes (use Lucene's IndexSplitter?)
 * When a region is written to HDFS, also write the corresponding Lucene index 
 to HDFS.
 * A row key will be the ID of a given Lucene document.  The Lucene docstore 
 will explicitly not be used because the document/row data is stored in HBase. 
  We will need to work out the best data structure for efficiently mapping a 
 docid -> row key.  It could be a docstore, field cache, column stride 
 fields, or some other mechanism.
 * Write unit tests for the above
 Phase 2 - Queries:
 * Enable distributed Lucene queries
 * Regions that have Lucene indexes are inherently available and may be 
 searched on, meaning there's no need for a separate search related system in 
 Zookeeper.
 * Integrate search with HBase's RPC mechanism

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5816) Balancer and ServerShutdownHandler concurrently reassigning the same region

2012-04-26 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262499#comment-13262499
 ] 

ramkrishna.s.vasudevan commented on HBASE-5816:
---

@Maryann
If you are planning to work on this pls go ahead. :)

 Balancer and ServerShutdownHandler concurrently reassigning the same region
 ---

 Key: HBASE-5816
 URL: https://issues.apache.org/jira/browse/HBASE-5816
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.6
Reporter: Maryann Xue
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Attachments: HBASE-5816.patch


 The first assign thread exits with success after updating the RegionState to 
 PENDING_OPEN, while the second assign follows immediately into assign() and 
 fails the RegionState check in setOfflineInZooKeeper(). This causes the 
 master to abort.
 In the case below, the two concurrent assigns occurred when the AM tried to 
 assign a region to a dying/dead RS, and meanwhile the ServerShutdownHandler 
 tried to assign this region (from the region plan) spontaneously.
 2012-04-17 05:44:57,648 INFO org.apache.hadoop.hbase.master.HMaster: balance 
 hri=TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b., 
 src=hadoop05.sh.intel.com,60020,1334544902186, 
 dest=xmlqa-clv16.sh.intel.com,60020,1334612497253
 2012-04-17 05:44:57,648 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of 
 region TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b. 
 (offlining)
 2012-04-17 05:44:57,648 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Sent CLOSE to 
 serverName=hadoop05.sh.intel.com,60020,1334544902186, load=(requests=0, 
 regions=0, usedHeap=0, maxHeap=0) for region 
 TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b.
 2012-04-17 05:44:57,666 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Handling new unassigned 
 node: /hbase/unassigned/fe38fe31caf40b6e607a3e6bbed6404b 
 (region=TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b.,
  server=hadoop05.sh.intel.com,60020,1334544902186, state=RS_ZK_REGION_CLOSING)
 2012-04-17 05:52:58,984 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; 
 was=TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b. 
 state=CLOSED, ts=1334612697672, 
 server=hadoop05.sh.intel.com,60020,1334544902186
 2012-04-17 05:52:58,984 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
 master:6-0x236b912e9b3000e Creating (or updating) unassigned node for 
 fe38fe31caf40b6e607a3e6bbed6404b with OFFLINE state
 2012-04-17 05:52:59,096 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Using pre-existing plan for 
 region TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b.; 
 plan=hri=TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b.,
  src=hadoop05.sh.intel.com,60020,1334544902186, 
 dest=xmlqa-clv16.sh.intel.com,60020,1334612497253
 2012-04-17 05:52:59,096 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Assigning region 
 TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b. to 
 xmlqa-clv16.sh.intel.com,60020,1334612497253
 2012-04-17 05:54:19,159 DEBUG 
 org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; 
 was=TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b. 
 state=PENDING_OPEN, ts=1334613179096, 
 server=xmlqa-clv16.sh.intel.com,60020,1334612497253
 2012-04-17 05:54:59,033 WARN 
 org.apache.hadoop.hbase.master.AssignmentManager: Failed assignment of 
 TABLE_ORDER_CUSTOMER,,1334017820846.fe38fe31caf40b6e607a3e6bbed6404b. to 
 serverName=xmlqa-clv16.sh.intel.com,60020,1334612497253, load=(requests=0, 
 regions=0, usedHeap=0, maxHeap=0), trying to assign elsewhere instead; retry=0
 java.net.SocketTimeoutException: Call to /10.239.47.87:60020 failed on socket 
 timeout exception: java.net.SocketTimeoutException: 12 millis timeout 
 while waiting for channel to be ready for read. ch : 
 java.nio.channels.SocketChannel[connected local=/10.239.47.89:41302 
 remote=/10.239.47.87:60020]
 at 
 org.apache.hadoop.hbase.ipc.HBaseClient.wrapException(HBaseClient.java:805)
 at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:778)
 at 
 org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:283)
 at $Proxy7.openRegion(Unknown Source)
 at 
 org.apache.hadoop.hbase.master.ServerManager.sendRegionOpen(ServerManager.java:573)
 at 
 org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1127)
 at 
 org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:912)
 at 
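A minimal, self-contained Java sketch of the kind of claim-before-assign guard the race above calls for. This is only an illustration of the idea, not the attached HBASE-5816.patch, and every name in it (DoubleAssignGuard, tryStartAssign, the State enum) is made up:
{code}
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Toy illustration: two "assign" callers racing on the same region. The guard
// makes the second caller back off instead of proceeding and tripping the
// state check that currently aborts the master.
public class DoubleAssignGuard {
  enum State { PENDING_OPEN }

  private final ConcurrentMap<String, State> regionsInTransition =
      new ConcurrentHashMap<String, State>();

  /** Returns true only for the caller that wins the right to assign the region. */
  public boolean tryStartAssign(String encodedRegionName) {
    State prev = regionsInTransition.putIfAbsent(encodedRegionName, State.PENDING_OPEN);
    if (prev != null) {
      // Another thread (balancer or ServerShutdownHandler) already started this
      // assignment; skip it instead of failing setOfflineInZooKeeper().
      return false;
    }
    return true;
  }

  public static void main(String[] args) {
    DoubleAssignGuard guard = new DoubleAssignGuard();
    System.out.println(guard.tryStartAssign("fe38fe31caf40b6e607a3e6bbed6404b")); // true
    System.out.println(guard.tryStartAssign("fe38fe31caf40b6e607a3e6bbed6404b")); // false
  }
}
{code}
In AssignmentManager terms, the loser of the race would simply log and return instead of reaching the RegionState check that makes the master abort.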

[jira] [Created] (HBASE-5883) Backup master is going down due to connection refused exception

2012-04-26 Thread Gopinathan A (JIRA)
Gopinathan A created HBASE-5883:
---

 Summary: Backup master is going down due to connection refused 
exception
 Key: HBASE-5883
 URL: https://issues.apache.org/jira/browse/HBASE-5883
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.1, 0.90.6, 0.94.0
Reporter: Gopinathan A


The active master node's network was down for some time (this node contains 
Master, DN, ZK, RS). The backup node got the notification and started to become 
active. Immediately, the backup node aborted with the below exception.

{noformat}
2012-04-09 10:42:24,270 INFO org.apache.hadoop.hbase.master.SplitLogManager: 
finished splitting (more than or equal to) 861248320 bytes in 4 log files in 
[hdfs://192.168.47.205:9000/hbase/.logs/HOST-192-168-47-202,60020,1333715537172-splitting]
 in 26374ms
2012-04-09 10:42:24,316 FATAL org.apache.hadoop.hbase.master.HMaster: Master 
server abort: loaded coprocessors are: []
2012-04-09 10:42:24,333 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled 
exception. Starting shutdown.
java.io.IOException: java.net.ConnectException: Connection refused
at 
org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:375)
at 
org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1045)
at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:897)
at 
org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:150)
at $Proxy13.getProtocolVersion(Unknown Source)
at 
org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:183)
at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:303)
at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:280)
at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:332)
at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:236)
at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1276)
at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1233)
at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1220)
at 
org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:569)
at 
org.apache.hadoop.hbase.catalog.CatalogTracker.getRootServerConnection(CatalogTracker.java:369)
at 
org.apache.hadoop.hbase.catalog.CatalogTracker.waitForRootServerConnection(CatalogTracker.java:353)
at 
org.apache.hadoop.hbase.catalog.CatalogTracker.verifyRootRegionLocation(CatalogTracker.java:660)
at 
org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:616)
at 
org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:540)
at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:363)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at 
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
at 
org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:488)
at 
org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupConnection(HBaseClient.java:328)
at 
org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:362)
... 20 more
2012-04-09 10:42:24,336 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
2012-04-09 10:42:24,336 DEBUG org.apache.hadoop.hbase.master.HMaster: Stopping 
service threads
{noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Assigned] (HBASE-5883) Backup master is going down due to connection refused exception

2012-04-26 Thread Jieshan Bean (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jieshan Bean reassigned HBASE-5883:
---

Assignee: Jieshan Bean

 Backup master is going down due to connection refused exception
 ---

 Key: HBASE-5883
 URL: https://issues.apache.org/jira/browse/HBASE-5883
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.6, 0.92.1, 0.94.0
Reporter: Gopinathan A
Assignee: Jieshan Bean

 The active master node's network was down for some time (this node contains 
 Master, DN, ZK, RS). The backup node got the notification and started to 
 become active. Immediately, the backup node aborted with the below exception.
 {noformat}
 2012-04-09 10:42:24,270 INFO org.apache.hadoop.hbase.master.SplitLogManager: 
 finished splitting (more than or equal to) 861248320 bytes in 4 log files in 
 [hdfs://192.168.47.205:9000/hbase/.logs/HOST-192-168-47-202,60020,1333715537172-splitting]
  in 26374ms
 2012-04-09 10:42:24,316 FATAL org.apache.hadoop.hbase.master.HMaster: Master 
 server abort: loaded coprocessors are: []
 2012-04-09 10:42:24,333 FATAL org.apache.hadoop.hbase.master.HMaster: 
 Unhandled exception. Starting shutdown.
 java.io.IOException: java.net.ConnectException: Connection refused
   at 
 org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:375)
   at 
 org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1045)
   at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:897)
   at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:150)
   at $Proxy13.getProtocolVersion(Unknown Source)
   at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:183)
   at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:303)
   at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:280)
   at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:332)
   at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:236)
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1276)
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1233)
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1220)
   at 
 org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:569)
   at 
 org.apache.hadoop.hbase.catalog.CatalogTracker.getRootServerConnection(CatalogTracker.java:369)
   at 
 org.apache.hadoop.hbase.catalog.CatalogTracker.waitForRootServerConnection(CatalogTracker.java:353)
   at 
 org.apache.hadoop.hbase.catalog.CatalogTracker.verifyRootRegionLocation(CatalogTracker.java:660)
   at 
 org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:616)
   at 
 org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:540)
   at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:363)
   at java.lang.Thread.run(Thread.java:662)
 Caused by: java.net.ConnectException: Connection refused
   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
   at 
 sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
   at 
 org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
   at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:488)
   at 
 org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupConnection(HBaseClient.java:328)
   at 
 org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:362)
   ... 20 more
 2012-04-09 10:42:24,336 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
 2012-04-09 10:42:24,336 DEBUG org.apache.hadoop.hbase.master.HMaster: 
 Stopping service threads
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5883) Backup master is going down due to connection refused exception

2012-04-26 Thread Jieshan Bean (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262515#comment-13262515
 ] 

Jieshan Bean commented on HBASE-5883:
-

From the below log:
{noformat}
2012-04-09 10:42:24,333 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled 
exception. Starting shutdown.
java.io.IOException: java.net.ConnectException: Connection refused
{noformat}
We can deduce that the ConnectException was wrapped in an IOException, like below:

new IOException(new ConnectException("Connection refused"));

or something like:
new IOException(connectException.toString());

If so, this exception is not handled in the code.
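A minimal sketch of how the wrapped cause could be detected and treated as a transient, retriable connection problem. This is only an illustration, not the eventual HBASE-5883 fix; the class and method names are made up:
{code}
// Walk the cause chain so a ConnectException buried inside an IOException is
// still recognised instead of being treated as an unhandled fatal exception.
import java.io.IOException;
import java.net.ConnectException;

public class ConnectRefusedCheck {
  static boolean isConnectionRefused(Throwable t) {
    while (t != null) {
      if (t instanceof ConnectException) {
        return true;
      }
      t = t.getCause();
    }
    return false;
  }

  public static void main(String[] args) {
    IOException wrapped = new IOException(new ConnectException("Connection refused"));
    System.out.println(isConnectionRefused(wrapped));              // true
    System.out.println(isConnectionRefused(new IOException("x"))); // false
  }
}
{code}
Note that if the ConnectException was flattened into the message string (the second form above), the cause chain is lost and a check like this would miss it; the message text would have to be inspected instead.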


 Backup master is going down due to connection refused exception
 ---

 Key: HBASE-5883
 URL: https://issues.apache.org/jira/browse/HBASE-5883
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.6, 0.92.1, 0.94.0
Reporter: Gopinathan A
Assignee: Jieshan Bean

 The active master node's network was down for some time (this node contains 
 Master, DN, ZK, RS). The backup node got the notification and started to 
 become active. Immediately, the backup node aborted with the below exception.
 {noformat}
 2012-04-09 10:42:24,270 INFO org.apache.hadoop.hbase.master.SplitLogManager: 
 finished splitting (more than or equal to) 861248320 bytes in 4 log files in 
 [hdfs://192.168.47.205:9000/hbase/.logs/HOST-192-168-47-202,60020,1333715537172-splitting]
  in 26374ms
 2012-04-09 10:42:24,316 FATAL org.apache.hadoop.hbase.master.HMaster: Master 
 server abort: loaded coprocessors are: []
 2012-04-09 10:42:24,333 FATAL org.apache.hadoop.hbase.master.HMaster: 
 Unhandled exception. Starting shutdown.
 java.io.IOException: java.net.ConnectException: Connection refused
   at 
 org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:375)
   at 
 org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1045)
   at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:897)
   at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:150)
   at $Proxy13.getProtocolVersion(Unknown Source)
   at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:183)
   at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:303)
   at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:280)
   at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:332)
   at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:236)
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1276)
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1233)
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1220)
   at 
 org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:569)
   at 
 org.apache.hadoop.hbase.catalog.CatalogTracker.getRootServerConnection(CatalogTracker.java:369)
   at 
 org.apache.hadoop.hbase.catalog.CatalogTracker.waitForRootServerConnection(CatalogTracker.java:353)
   at 
 org.apache.hadoop.hbase.catalog.CatalogTracker.verifyRootRegionLocation(CatalogTracker.java:660)
   at 
 org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:616)
   at 
 org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:540)
   at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:363)
   at java.lang.Thread.run(Thread.java:662)
 Caused by: java.net.ConnectException: Connection refused
   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
   at 
 sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
   at 
 org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
   at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:488)
   at 
 org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupConnection(HBaseClient.java:328)
   at 
 org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:362)
   ... 20 more
 2012-04-09 10:42:24,336 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
 2012-04-09 10:42:24,336 DEBUG org.apache.hadoop.hbase.master.HMaster: 
 Stopping service threads
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: 

[jira] [Reopened] (HBASE-5161) Compaction algorithm should prioritize reference files

2012-04-26 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan reopened HBASE-5161:
---


 Compaction algorithm should prioritize reference files
 --

 Key: HBASE-5161
 URL: https://issues.apache.org/jira/browse/HBASE-5161
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Jean-Daniel Cryans
Priority: Critical
 Fix For: 0.92.1, 0.94.0


 I got myself into a state where my table was un-splittable as long as the 
 insert load was coming in. Emergency flushes because of the low memory 
 barrier don't check the number of store files so it never blocks, to a point 
 where I had in one case 45 store files and the compactions were almost never 
 done on the reference files (had 15 of them, went down by one in 20 minutes). 
 Since you can't split regions with reference files, that region couldn't 
 split and was doomed to just get more store files until the load stopped.
 Marking this as a minor issue, what we really need is a better pushback 
 mechanism but not prioritizing reference files seems wrong.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5161) Compaction algorithm should prioritize reference files

2012-04-26 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262537#comment-13262537
 ] 

ramkrishna.s.vasudevan commented on HBASE-5161:
---

I did not want to open a new issue.  Thought I could reopen this one.  Correct 
me if I am wrong.

I have a heavy write load going on and region splits keep happening.
When the load on one of my regions becomes too heavy, the split starts and 
creates more reference files than my max files to compact.

So suppose my 'hbase.hstore.compaction.max' is 20 and the reference files 
created are 32*2 (64 files).
In compaction selection:
{code}
if (compactSelection.getFilesToCompact().size() > this.maxFilesToCompact) {
  int pastMax =
    compactSelection.getFilesToCompact().size() - this.maxFilesToCompact;
  compactSelection.clearSubList(0, pastMax);
}
{code}

The filesToCompact list is ordered by seq id.  In this case the files from 0 to 
pastMax, i.e. the reference files with the lower seq ids, are not considered 
for compaction. By then more store files have been created, and the earlier 
ones are skipped once again. Those being reference files, the split never 
happens.  The region grew up to 400G.
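A toy sketch of what "prioritize reference files" could mean for the trimming step above. This only illustrates the idea under assumed names (StoreFileInfo, trim); it is not the real Store/CompactSelection code or any attached patch:
{code}
import java.util.ArrayList;
import java.util.List;

public class ReferenceFirstSelection {
  static class StoreFileInfo {
    final String name;
    final boolean isReference;
    StoreFileInfo(String name, boolean isReference) {
      this.name = name;
      this.isReference = isReference;
    }
  }

  // Trim the selection to maxFilesToCompact, but never drop reference files:
  // they block splits, so they are kept even if that exceeds the cap.
  static List<StoreFileInfo> trim(List<StoreFileInfo> filesToCompact, int maxFilesToCompact) {
    if (filesToCompact.size() <= maxFilesToCompact) {
      return filesToCompact;
    }
    List<StoreFileInfo> kept = new ArrayList<StoreFileInfo>();
    // First pass: always keep reference files (oldest seq ids come first).
    for (StoreFileInfo f : filesToCompact) {
      if (f.isReference) {
        kept.add(f);
      }
    }
    // Second pass: fill the remaining slots with the newest non-reference files.
    for (int i = filesToCompact.size() - 1; i >= 0 && kept.size() < maxFilesToCompact; i--) {
      StoreFileInfo f = filesToCompact.get(i);
      if (!f.isReference) {
        kept.add(f);
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    List<StoreFileInfo> files = new ArrayList<StoreFileInfo>();
    for (int i = 0; i < 6; i++) {
      files.add(new StoreFileInfo("f" + i, i < 2));   // the two oldest files are references
    }
    for (StoreFileInfo f : trim(files, 3)) {
      System.out.println(f.name + " reference=" + f.isReference);
    }
  }
}
{code}
In the real selection the kept files would still need to be re-ordered by sequence id, but the point is that clearSubList(0, pastMax) should not be allowed to silently drop every reference file.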

 Compaction algorithm should prioritize reference files
 --

 Key: HBASE-5161
 URL: https://issues.apache.org/jira/browse/HBASE-5161
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Jean-Daniel Cryans
Priority: Critical
 Fix For: 0.92.1, 0.94.0


 I got myself into a state where my table was un-splittable as long as the 
 insert load was coming in. Emergency flushes because of the low memory 
 barrier don't check the number of store files so it never blocks, to a point 
 where I had in one case 45 store files and the compactions were almost never 
 done on the reference files (had 15 of them, went down by one in 20 minutes). 
 Since you can't split regions with reference files, that region couldn't 
 split and was doomed to just get more store files until the load stopped.
 Marking this as a minor issue, what we really need is a better pushback 
 mechanism but not prioritizing reference files seems wrong.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (HBASE-5161) Compaction algorithm should prioritize reference files

2012-04-26 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262537#comment-13262537
 ] 

ramkrishna.s.vasudevan edited comment on HBASE-5161 at 4/26/12 11:53 AM:
-

I did not want to open a new issue.  Thought I could reopen this one.  Correct 
me if I am wrong.

I have a heavy write load going on and region splits keep happening.
When the load on one of my regions becomes too heavy, the split starts and 
creates more reference files than my max files to compact.

So suppose my 'hbase.hstore.compaction.max' is 20 and the reference files 
created are 32*2 (64 files).
In compaction selection:
{code}
if (compactSelection.getFilesToCompact().size() > this.maxFilesToCompact) {
  int pastMax =
    compactSelection.getFilesToCompact().size() - this.maxFilesToCompact;
  compactSelection.clearSubList(0, pastMax);
}
{code}

The filesToCompact list is ordered by seq id.  In this case the files from 0 to 
pastMax, i.e. the reference files with the lower seq ids, are not considered 
for compaction. By then more store files have been created, and the earlier 
ones are skipped once again. Those being reference files, the split never 
happens.  The region grew up to 400G.

Note that we even tried to stop the writes for 2 to 3 hours, but the compaction 
still did not pick up the reference files. 

  was (Author: ram_krish):
I did not want to open a new issue.  Thought i can reopen this issue.  
Correct me if am wrong.

I have a high load write operation going on.  Region split keeps happening.
When to one of my region the load becomes too heavy the split starts and 
creates lot of reference files which is greater than my maxfilestocompact.

So suppose my 'hbase.hstore.compaction.max' is 20 and the reference files that 
are created is 32*2 (64 files).
In compaction selection
{code}
if (compactSelection.getFilesToCompact().size() > this.maxFilesToCompact) {
int pastMax =
  compactSelection.getFilesToCompact().size() - this.maxFilesToCompact;
compactSelection.clearSubList(0, pastMax);
  }
{code}

The filesToCompact is ordered based on seq id.  In this case the set of files 
from 0 to pastMax (i.e) the reference files which has lesser seq id are not 
considered for compaction. By the time more store files are created and once 
again the earlier created ones are avoided. Those being reference files, the 
split never happens.  The region grew upto 400G.
  
 Compaction algorithm should prioritize reference files
 --

 Key: HBASE-5161
 URL: https://issues.apache.org/jira/browse/HBASE-5161
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Jean-Daniel Cryans
Priority: Critical
 Fix For: 0.92.1, 0.94.0


 I got myself into a state where my table was un-splittable as long as the 
 insert load was coming in. Emergency flushes because of the low memory 
 barrier don't check the number of store files so it never blocks, to a point 
 where I had in one case 45 store files and the compactions were almost never 
 done on the reference files (had 15 of them, went down by one in 20 minutes). 
 Since you can't split regions with reference files, that region couldn't 
 split and was doomed to just get more store files until the load stopped.
 Marking this as a minor issue, what we really need is a better pushback 
 mechanism but not prioritizing reference files seems wrong.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5161) Compaction algorithm should prioritize reference files

2012-04-26 Thread Jieshan Bean (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262638#comment-13262638
 ] 

Jieshan Bean commented on HBASE-5161:
-

It's a really special scenario. High input pressure and splitting running in 
parallel caused this vicious circle.
We should check whether the compaction is still running. Compacting a 400G 
region will take a long time. 


 Compaction algorithm should prioritize reference files
 --

 Key: HBASE-5161
 URL: https://issues.apache.org/jira/browse/HBASE-5161
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Jean-Daniel Cryans
Priority: Critical
 Fix For: 0.92.1, 0.94.0


 I got myself into a state where my table was un-splittable as long as the 
 insert load was coming in. Emergency flushes because of the low memory 
 barrier don't check the number of store files so it never blocks, to a point 
 where I had in one case 45 store files and the compactions were almost never 
 done on the reference files (had 15 of them, went down by one in 20 minutes). 
 Since you can't split regions with reference files, that region couldn't 
 split and was doomed to just get more store files until the load stopped.
 Marking this as a minor issue, what we really need is a better pushback 
 mechanism but not prioritizing reference files seems wrong.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5611) Replayed edits from regions that failed to open during recovery aren't removed from the global MemStore size

2012-04-26 Thread Zhihong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262648#comment-13262648
 ] 

Zhihong Yu commented on HBASE-5611:
---

@Jieshan:
When you have multiple patches for different branches, attach the trunk patch 
separately from the other patches.
Otherwise Hadoop QA may pick up the wrong patch.

 Replayed edits from regions that failed to open during recovery aren't 
 removed from the global MemStore size
 

 Key: HBASE-5611
 URL: https://issues.apache.org/jira/browse/HBASE-5611
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.6
Reporter: Jean-Daniel Cryans
Assignee: Jieshan Bean
Priority: Critical
 Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.1

 Attachments: HBASE-5611-94.patch, HBASE-5611-trunk-v2.patch, 
 HBASE-5611-trunk.patch


 This bug is rather easy to get if the {{TimeoutMonitor}} is on, else I think 
 it's still possible to hit it if a region fails to open for more obscure 
 reasons like HDFS errors.
 Consider a region that just went through distributed splitting and that's now 
 being opened by a new RS. The first thing it does is to read the recovery 
 files and put the edits in the {{MemStores}}. If this process takes a long 
 time, the master will move that region away. At that point the edits are 
 still accounted for in the global {{MemStore}} size but they are dropped when 
 the {{HRegion}} gets cleaned up. It's completely invisible until the 
 {{MemStoreFlusher}} needs to force flush a region and that none of them have 
 edits:
 {noformat}
 2012-03-21 00:33:39,303 DEBUG 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush thread woke up 
 because memory above low water=5.9g
 2012-03-21 00:33:39,303 ERROR 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Cache flusher failed 
 for entry null
 java.lang.IllegalStateException
 at 
 com.google.common.base.Preconditions.checkState(Preconditions.java:129)
 at 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushOneForGlobalPressure(MemStoreFlusher.java:199)
 at 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher.run(MemStoreFlusher.java:223)
 at java.lang.Thread.run(Thread.java:662)
 {noformat}
 The {{null}} here is a region. In my case I had so many edits in the 
 {{MemStore}} during recovery that I'm over the low barrier although in fact 
 I'm at 0. It happened yesterday and it still printing this out.
 To fix this we need to be able to decrease the global {{MemStore}} size when 
 the region can't open.
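A minimal sketch of the accounting rollback suggested in the last line above. This is only an illustration with made-up names (GlobalMemstoreAccounting, rollbackRegion), not the attached HBASE-5611 patches:
{code}
import java.util.concurrent.atomic.AtomicLong;

// Whatever was added to the global MemStore counter while replaying recovered
// edits must be subtracted again if the region never finishes opening.
public class GlobalMemstoreAccounting {
  private final AtomicLong globalMemstoreSize = new AtomicLong(0);

  long addAndGet(long delta) {
    return globalMemstoreSize.addAndGet(delta);
  }

  /** Called from the failed-open / cleanup path for a region that replayed edits. */
  long rollbackRegion(long bytesAccountedForRegion) {
    return globalMemstoreSize.addAndGet(-bytesAccountedForRegion);
  }

  public static void main(String[] args) {
    GlobalMemstoreAccounting acct = new GlobalMemstoreAccounting();
    long afterReplay = acct.addAndGet(512L * 1024 * 1024);   // edits replayed during open
    System.out.println("after replay: " + afterReplay);
    // Region open failed (e.g. the master moved it away): undo the accounting.
    System.out.println("after rollback: " + acct.rollbackRegion(512L * 1024 * 1024));
  }
}
{code}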

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5611) Replayed edits from regions that failed to open during recovery aren't removed from the global MemStore size

2012-04-26 Thread Zhihong Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5611:
--

Hadoop Flags: Reviewed
  Status: Patch Available  (was: Open)

 Replayed edits from regions that failed to open during recovery aren't 
 removed from the global MemStore size
 

 Key: HBASE-5611
 URL: https://issues.apache.org/jira/browse/HBASE-5611
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.6
Reporter: Jean-Daniel Cryans
Assignee: Jieshan Bean
Priority: Critical
 Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.1

 Attachments: HBASE-5611-94.patch, HBASE-5611-trunk-v2.patch, 
 HBASE-5611-trunk.patch


 This bug is rather easy to get if the {{TimeoutMonitor}} is on, else I think 
 it's still possible to hit it if a region fails to open for more obscure 
 reasons like HDFS errors.
 Consider a region that just went through distributed splitting and that's now 
 being opened by a new RS. The first thing it does is to read the recovery 
 files and put the edits in the {{MemStores}}. If this process takes a long 
 time, the master will move that region away. At that point the edits are 
 still accounted for in the global {{MemStore}} size but they are dropped when 
 the {{HRegion}} gets cleaned up. It's completely invisible until the 
 {{MemStoreFlusher}} needs to force flush a region and that none of them have 
 edits:
 {noformat}
 2012-03-21 00:33:39,303 DEBUG 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush thread woke up 
 because memory above low water=5.9g
 2012-03-21 00:33:39,303 ERROR 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Cache flusher failed 
 for entry null
 java.lang.IllegalStateException
 at 
 com.google.common.base.Preconditions.checkState(Preconditions.java:129)
 at 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushOneForGlobalPressure(MemStoreFlusher.java:199)
 at 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher.run(MemStoreFlusher.java:223)
 at java.lang.Thread.run(Thread.java:662)
 {noformat}
 The {{null}} here is a region. In my case I had so many edits in the 
 {{MemStore}} during recovery that I'm over the low barrier although in fact 
 I'm at 0. It happened yesterday and it still printing this out.
 To fix this we need to be able to decrease the global {{MemStore}} size when 
 the region can't open.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5611) Replayed edits from regions that failed to open during recovery aren't removed from the global MemStore size

2012-04-26 Thread Jieshan Bean (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262651#comment-13262651
 ] 

Jieshan Bean commented on HBASE-5611:
-

Thank you, Ted. I will take care of that.

 Replayed edits from regions that failed to open during recovery aren't 
 removed from the global MemStore size
 

 Key: HBASE-5611
 URL: https://issues.apache.org/jira/browse/HBASE-5611
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.6
Reporter: Jean-Daniel Cryans
Assignee: Jieshan Bean
Priority: Critical
 Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.1

 Attachments: HBASE-5611-94.patch, HBASE-5611-trunk-v2.patch, 
 HBASE-5611-trunk.patch


 This bug is rather easy to get if the {{TimeoutMonitor}} is on, else I think 
 it's still possible to hit it if a region fails to open for more obscure 
 reasons like HDFS errors.
 Consider a region that just went through distributed splitting and that's now 
 being opened by a new RS. The first thing it does is to read the recovery 
 files and put the edits in the {{MemStores}}. If this process takes a long 
 time, the master will move that region away. At that point the edits are 
 still accounted for in the global {{MemStore}} size but they are dropped when 
 the {{HRegion}} gets cleaned up. It's completely invisible until the 
 {{MemStoreFlusher}} needs to force flush a region and that none of them have 
 edits:
 {noformat}
 2012-03-21 00:33:39,303 DEBUG 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush thread woke up 
 because memory above low water=5.9g
 2012-03-21 00:33:39,303 ERROR 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Cache flusher failed 
 for entry null
 java.lang.IllegalStateException
 at 
 com.google.common.base.Preconditions.checkState(Preconditions.java:129)
 at 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushOneForGlobalPressure(MemStoreFlusher.java:199)
 at 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher.run(MemStoreFlusher.java:223)
 at java.lang.Thread.run(Thread.java:662)
 {noformat}
 The {{null}} here is a region. In my case I had so many edits in the 
 {{MemStore}} during recovery that I'm over the low barrier although in fact 
 I'm at 0. It happened yesterday and it still printing this out.
 To fix this we need to be able to decrease the global {{MemStore}} size when 
 the region can't open.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5611) Replayed edits from regions that failed to open during recovery aren't removed from the global MemStore size

2012-04-26 Thread Zhihong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262652#comment-13262652
 ] 

Zhihong Yu commented on HBASE-5611:
---

Patch v2 looks good in general.
Comment on formatting:
{code}
+   * @param regionName
+   *  region name.
{code}
The line length limit is 100 chars. Please put javadoc for param on the same 
line as param name.
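For instance, the suggested layout would presumably be something like this (illustrative only, using the same regionName parameter as above):
{code}
  /**
   * ...
   * @param regionName region name.
   */
{code}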

You can wait for Hadoop QA result to come back before attaching new patches.

 Replayed edits from regions that failed to open during recovery aren't 
 removed from the global MemStore size
 

 Key: HBASE-5611
 URL: https://issues.apache.org/jira/browse/HBASE-5611
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.6
Reporter: Jean-Daniel Cryans
Assignee: Jieshan Bean
Priority: Critical
 Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.1

 Attachments: HBASE-5611-94.patch, HBASE-5611-trunk-v2.patch, 
 HBASE-5611-trunk.patch


 This bug is rather easy to get if the {{TimeoutMonitor}} is on, else I think 
 it's still possible to hit it if a region fails to open for more obscure 
 reasons like HDFS errors.
 Consider a region that just went through distributed splitting and that's now 
 being opened by a new RS. The first thing it does is to read the recovery 
 files and put the edits in the {{MemStores}}. If this process takes a long 
 time, the master will move that region away. At that point the edits are 
 still accounted for in the global {{MemStore}} size but they are dropped when 
 the {{HRegion}} gets cleaned up. It's completely invisible until the 
 {{MemStoreFlusher}} needs to force flush a region and that none of them have 
 edits:
 {noformat}
 2012-03-21 00:33:39,303 DEBUG 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush thread woke up 
 because memory above low water=5.9g
 2012-03-21 00:33:39,303 ERROR 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Cache flusher failed 
 for entry null
 java.lang.IllegalStateException
 at 
 com.google.common.base.Preconditions.checkState(Preconditions.java:129)
 at 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushOneForGlobalPressure(MemStoreFlusher.java:199)
 at 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher.run(MemStoreFlusher.java:223)
 at java.lang.Thread.run(Thread.java:662)
 {noformat}
 The {{null}} here is a region. In my case I had so many edits in the 
 {{MemStore}} during recovery that I'm over the low barrier although in fact 
 I'm at 0. It happened yesterday and it still printing this out.
 To fix this we need to be able to decrease the global {{MemStore}} size when 
 the region can't open.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5611) Replayed edits from regions that failed to open during recovery aren't removed from the global MemStore size

2012-04-26 Thread Jieshan Bean (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262656#comment-13262656
 ] 

Jieshan Bean commented on HBASE-5611:
-

I use the formatter from HBASE-3678. Every time, it formats my code like that.
I will change it after the QA result.

 Replayed edits from regions that failed to open during recovery aren't 
 removed from the global MemStore size
 

 Key: HBASE-5611
 URL: https://issues.apache.org/jira/browse/HBASE-5611
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.6
Reporter: Jean-Daniel Cryans
Assignee: Jieshan Bean
Priority: Critical
 Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.1

 Attachments: HBASE-5611-94.patch, HBASE-5611-trunk-v2.patch, 
 HBASE-5611-trunk.patch


 This bug is rather easy to get if the {{TimeoutMonitor}} is on, else I think 
 it's still possible to hit it if a region fails to open for more obscure 
 reasons like HDFS errors.
 Consider a region that just went through distributed splitting and that's now 
 being opened by a new RS. The first thing it does is to read the recovery 
 files and put the edits in the {{MemStores}}. If this process takes a long 
 time, the master will move that region away. At that point the edits are 
 still accounted for in the global {{MemStore}} size but they are dropped when 
 the {{HRegion}} gets cleaned up. It's completely invisible until the 
 {{MemStoreFlusher}} needs to force flush a region and that none of them have 
 edits:
 {noformat}
 2012-03-21 00:33:39,303 DEBUG 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush thread woke up 
 because memory above low water=5.9g
 2012-03-21 00:33:39,303 ERROR 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Cache flusher failed 
 for entry null
 java.lang.IllegalStateException
 at 
 com.google.common.base.Preconditions.checkState(Preconditions.java:129)
 at 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushOneForGlobalPressure(MemStoreFlusher.java:199)
 at 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher.run(MemStoreFlusher.java:223)
 at java.lang.Thread.run(Thread.java:662)
 {noformat}
 The {{null}} here is a region. In my case I had so many edits in the 
 {{MemStore}} during recovery that I'm over the low barrier although in fact 
 I'm at 0. It happened yesterday and it still printing this out.
 To fix this we need to be able to decrease the global {{MemStore}} size when 
 the region can't open.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (HBASE-5611) Replayed edits from regions that failed to open during recovery aren't removed from the global MemStore size

2012-04-26 Thread Zhihong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262652#comment-13262652
 ] 

Zhihong Yu edited comment on HBASE-5611 at 4/26/12 2:58 PM:


Patch v2 looks good in general.
Comment on formatting:
{code}
+   * @param regionName
+   *  region name.
{code}
The line length limit is 100 chars. Please put javadoc for param on the same 
line as param name.

You can wait for Hadoop QA result to come back before attaching new patches.

  was (Author: zhi...@ebaysf.com):
Patch v2 looks good in general.
Comment on formatting:
{code}
+   * @param regionName
+   *  region name.
{code}
The line length is 100 chars. Please put javadoc for param on the same line as 
param name.

You can wait for Hadoop QA result to come back before attaching new patches.
  
 Replayed edits from regions that failed to open during recovery aren't 
 removed from the global MemStore size
 

 Key: HBASE-5611
 URL: https://issues.apache.org/jira/browse/HBASE-5611
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.6
Reporter: Jean-Daniel Cryans
Assignee: Jieshan Bean
Priority: Critical
 Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.1

 Attachments: HBASE-5611-94.patch, HBASE-5611-trunk-v2.patch, 
 HBASE-5611-trunk.patch


 This bug is rather easy to get if the {{TimeoutMonitor}} is on, else I think 
 it's still possible to hit it if a region fails to open for more obscure 
 reasons like HDFS errors.
 Consider a region that just went through distributed splitting and that's now 
 being opened by a new RS. The first thing it does is to read the recovery 
 files and put the edits in the {{MemStores}}. If this process takes a long 
 time, the master will move that region away. At that point the edits are 
 still accounted for in the global {{MemStore}} size but they are dropped when 
 the {{HRegion}} gets cleaned up. It's completely invisible until the 
 {{MemStoreFlusher}} needs to force flush a region and that none of them have 
 edits:
 {noformat}
 2012-03-21 00:33:39,303 DEBUG 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush thread woke up 
 because memory above low water=5.9g
 2012-03-21 00:33:39,303 ERROR 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Cache flusher failed 
 for entry null
 java.lang.IllegalStateException
 at 
 com.google.common.base.Preconditions.checkState(Preconditions.java:129)
 at 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushOneForGlobalPressure(MemStoreFlusher.java:199)
 at 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher.run(MemStoreFlusher.java:223)
 at java.lang.Thread.run(Thread.java:662)
 {noformat}
 The {{null}} here is a region. In my case I had so many edits in the 
 {{MemStore}} during recovery that I'm over the low barrier although in fact 
 I'm at 0. It happened yesterday and it still printing this out.
 To fix this we need to be able to decrease the global {{MemStore}} size when 
 the region can't open.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5875) Process RIT and Master restart may remove an online server considering it as a dead server

2012-04-26 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262658#comment-13262658
 ] 

ramkrishna.s.vasudevan commented on HBASE-5875:
---

I would like to get some suggestions on this:
{code}
boolean rit = this.assignmentManager.
    processRegionInTransitionAndBlockUntilAssigned(HRegionInfo.ROOT_REGIONINFO);
ServerName currentRootServer = null;
if (!catalogTracker.verifyRootRegionLocation(timeout)) {
  currentRootServer = this.catalogTracker.getRootLocation();
{code}
Consider the case where my ROOT node is found in RIT.  Hence processRIT will 
trigger the assignment.

It so happened that when I tried verifyRootRegionLocation, the root znode was 
created but the OpenRegionHandler had not yet added the ROOT region to its 
in-memory online list (a very, very corner case, and it happened once while 
testing).  So verifyRootRegionLocation returns false, and hence the master 
treats the server as one to be expired.  So we remove a normal, active RS from 
the master's memory thinking it is dead, and I lose that RS from the master's 
list of online servers.  How can we handle this scenario?

Can we retry verifyRootRegionLocation if it returns false and the boolean 
variable 'rit' is true?
Or can we update the root region node on the RS side after updating the online 
server list? Suggestions welcome...
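A minimal sketch of the retry idea in the first question above; the retry count, sleep interval and all names here are assumptions, not the final HBASE-5875 change:
{code}
public class RootLocationVerifier {
  interface RootCheck {
    boolean verifyRootRegionLocation(long timeoutMs) throws Exception;
  }

  // Retry only when 'rit' is true, i.e. when the master itself just triggered
  // the ROOT assignment and the OpenRegionHandler may not have registered the
  // region in the RS online list yet.
  static boolean verifyWithRetries(RootCheck catalogTracker, boolean rit,
      long timeoutMs, int maxRetries, long sleepMs) throws Exception {
    int attempts = rit ? maxRetries : 1;
    for (int i = 0; i < attempts; i++) {
      if (catalogTracker.verifyRootRegionLocation(timeoutMs)) {
        return true;
      }
      if (i + 1 < attempts) {
        Thread.sleep(sleepMs);
      }
    }
    return false;
  }
}
{code}
Only after verifyWithRetries() comes back false would the master fall through to splitLogAndExpireIfOnline().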

 Process RIT and Master restart may remove an online server considering it as 
 a dead server
 --

 Key: HBASE-5875
 URL: https://issues.apache.org/jira/browse/HBASE-5875
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.1
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.94.1


 If on master restart it finds ROOT/META to be in the RIT state, the master 
 tries to assign the ROOT region through ProcessRIT.
 The master will trigger the assignment and will next try to verify the root 
 region location.
 Root region location verification is done by checking whether the RS has the 
 region in its online list.
 If the master-triggered assignment has not yet completed on the RS, then the 
 verify-root-region-location check will fail.
 Because it failed, 
 {code}
 splitLogAndExpireIfOnline(currentRootServer);
 {code}
 we do the log split and also remove the server from the online server list. 
 Ideally there is nothing to do in the log split here, as no region server was 
 restarted.
 So, though the server is online, the master just invalidates the region 
 server.
 In a special case, if I have only one RS, then my cluster becomes 
 non-operative.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5611) Replayed edits from regions that failed to open during recovery aren't removed from the global MemStore size

2012-04-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262673#comment-13262673
 ] 

Hadoop QA commented on HBASE-5611:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12524416/HBASE-5611-trunk-v2.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 4 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1656//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1656//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1656//console

This message is automatically generated.

 Replayed edits from regions that failed to open during recovery aren't 
 removed from the global MemStore size
 

 Key: HBASE-5611
 URL: https://issues.apache.org/jira/browse/HBASE-5611
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.6
Reporter: Jean-Daniel Cryans
Assignee: Jieshan Bean
Priority: Critical
 Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.1

 Attachments: HBASE-5611-94.patch, HBASE-5611-trunk-v2.patch, 
 HBASE-5611-trunk.patch


 This bug is rather easy to get if the {{TimeoutMonitor}} is on, else I think 
 it's still possible to hit it if a region fails to open for more obscure 
 reasons like HDFS errors.
 Consider a region that just went through distributed splitting and that's now 
 being opened by a new RS. The first thing it does is to read the recovery 
 files and put the edits in the {{MemStores}}. If this process takes a long 
 time, the master will move that region away. At that point the edits are 
 still accounted for in the global {{MemStore}} size but they are dropped when 
 the {{HRegion}} gets cleaned up. It's completely invisible until the 
 {{MemStoreFlusher}} needs to force flush a region and that none of them have 
 edits:
 {noformat}
 2012-03-21 00:33:39,303 DEBUG 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush thread woke up 
 because memory above low water=5.9g
 2012-03-21 00:33:39,303 ERROR 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Cache flusher failed 
 for entry null
 java.lang.IllegalStateException
 at 
 com.google.common.base.Preconditions.checkState(Preconditions.java:129)
 at 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushOneForGlobalPressure(MemStoreFlusher.java:199)
 at 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher.run(MemStoreFlusher.java:223)
 at java.lang.Thread.run(Thread.java:662)
 {noformat}
 The {{null}} here is a region. In my case I had so many edits in the 
 {{MemStore}} during recovery that I'm over the low barrier although in fact 
 I'm at 0. It happened yesterday and it still printing this out.
 To fix this we need to be able to decrease the global {{MemStore}} size when 
 the region can't open.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5875) Process RIT and Master restart may remove an online server considering it as a dead server

2012-04-26 Thread Zhihong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262698#comment-13262698
 ] 

Zhihong Yu commented on HBASE-5875:
---

bq. Or can we update the root region node on the RS side after updating the 
online server list?
Let's try this approach first.

The other approach would involve retry count, sleep interval, etc.

 Process RIT and Master restart may remove an online server considering it as 
 a dead server
 --

 Key: HBASE-5875
 URL: https://issues.apache.org/jira/browse/HBASE-5875
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.1
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.94.1


 If on master restart it finds ROOT/META to be in the RIT state, the master 
 tries to assign the ROOT region through ProcessRIT.
 The master will trigger the assignment and will next try to verify the root 
 region location.
 Root region location verification is done by checking whether the RS has the 
 region in its online list.
 If the master-triggered assignment has not yet completed on the RS, then the 
 verify-root-region-location check will fail.
 Because it failed, 
 {code}
 splitLogAndExpireIfOnline(currentRootServer);
 {code}
 we do the log split and also remove the server from the online server list. 
 Ideally there is nothing to do in the log split here, as no region server was 
 restarted.
 So, though the server is online, the master just invalidates the region 
 server.
 In a special case, if I have only one RS, then my cluster becomes 
 non-operative.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5611) Replayed edits from regions that failed to open during recovery aren't removed from the global MemStore size

2012-04-26 Thread Zhihong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262702#comment-13262702
 ] 

Zhihong Yu commented on HBASE-5611:
---

Tests were clear.

@Jieshan:
Please address formatting and prepare patches for each branch.

We should also run test suite for 0.90 and 0.92 once patches are available.

Good job.

 Replayed edits from regions that failed to open during recovery aren't 
 removed from the global MemStore size
 

 Key: HBASE-5611
 URL: https://issues.apache.org/jira/browse/HBASE-5611
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.6
Reporter: Jean-Daniel Cryans
Assignee: Jieshan Bean
Priority: Critical
 Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.1

 Attachments: HBASE-5611-94.patch, HBASE-5611-trunk-v2.patch, 
 HBASE-5611-trunk.patch


 This bug is rather easy to get if the {{TimeoutMonitor}} is on, else I think 
 it's still possible to hit it if a region fails to open for more obscure 
 reasons like HDFS errors.
 Consider a region that just went through distributed splitting and that's now 
 being opened by a new RS. The first thing it does is to read the recovery 
 files and put the edits in the {{MemStores}}. If this process takes a long 
 time, the master will move that region away. At that point the edits are 
 still accounted for in the global {{MemStore}} size but they are dropped when 
 the {{HRegion}} gets cleaned up. It's completely invisible until the 
 {{MemStoreFlusher}} needs to force flush a region and that none of them have 
 edits:
 {noformat}
 2012-03-21 00:33:39,303 DEBUG 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush thread woke up 
 because memory above low water=5.9g
 2012-03-21 00:33:39,303 ERROR 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Cache flusher failed 
 for entry null
 java.lang.IllegalStateException
 at 
 com.google.common.base.Preconditions.checkState(Preconditions.java:129)
 at 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushOneForGlobalPressure(MemStoreFlusher.java:199)
 at 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher.run(MemStoreFlusher.java:223)
 at java.lang.Thread.run(Thread.java:662)
 {noformat}
 The {{null}} here is a region. In my case I had so many edits in the 
 {{MemStore}} during recovery that I'm over the low barrier although in fact 
 I'm at 0. It happened yesterday and it still printing this out.
 To fix this we need to be able to decrease the global {{MemStore}} size when 
 the region can't open.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5877) When a query fails because the region has moved, let the regionserver return the new address to the client

2012-04-26 Thread Zhihong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262710#comment-13262710
 ] 

Zhihong Yu commented on HBASE-5877:
---

@N:
I think the following validation in a real cluster would illustrate the benefit 
of this feature:
For a given table, select a region server and note the row key ranges it 
hosts. Direct client load to this server.
Kill the server at time T.

The difference in client response to the region migration around time T, with 
and without the patch, would be interesting.

 When a query fails because the region has moved, let the regionserver return 
 the new address to the client
 --

 Key: HBASE-5877
 URL: https://issues.apache.org/jira/browse/HBASE-5877
 Project: HBase
  Issue Type: Improvement
  Components: client, master, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.96.0

 Attachments: 5877.v1.patch


 This is mainly useful when we do a rolling restart. This will decrease the 
 load on the master and the network load.
 Note that a region is not immediately opened after a close. So:
 - it seems preferable to wait before retrying on the other server. An 
 optimisation would be to have a heuristic depending on when the region was 
 closed.
 - during a rolling restart, the server moves the regions then stops. So we 
 may have failures when the server is stopped, and this patch won't help.
 The implementation in the first patch does:
 - on the region move, there is an added parameter on the regionserver#close 
 to say where we are sending the region
 - the regionserver keeps a list of what was moved. Each entry is kept 100 
 seconds.
 - the regionserver sends a specific exception when it receives a query on a 
 moved region. This exception contains the new address.
 - the client analyses the exceptions and updates its cache accordingly (see 
 the sketch below)...
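A rough, self-contained sketch of the "list of what was moved" idea from the bullets above. It only illustrates the data structure, not the attached 5877.v1.patch, and every name in it (MovedRegionsCache, regionMoved, getNewAddress) is made up:
{code}
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Remember where each closed region went, expire entries after ~100 seconds,
// and use the cache to answer misdirected clients with the new address.
public class MovedRegionsCache {
  private static final long TTL_MS = 100 * 1000L;   // "each entry is kept 100 seconds"

  private static class Entry {
    final String destinationServer;
    final long movedAtMs;
    Entry(String destinationServer, long movedAtMs) {
      this.destinationServer = destinationServer;
      this.movedAtMs = movedAtMs;
    }
  }

  private final ConcurrentMap<String, Entry> moved =
      new ConcurrentHashMap<String, Entry>();

  /** Called from the close path when the close is part of a move. */
  public void regionMoved(String encodedRegionName, String destinationServer) {
    moved.put(encodedRegionName, new Entry(destinationServer, System.currentTimeMillis()));
  }

  /** Returns the new address if the region moved recently, else null. */
  public String getNewAddress(String encodedRegionName) {
    Entry e = moved.get(encodedRegionName);
    if (e == null) {
      return null;
    }
    if (System.currentTimeMillis() - e.movedAtMs > TTL_MS) {
      moved.remove(encodedRegionName, e);   // expired; forget it
      return null;
    }
    return e.destinationServer;
  }
}
{code}
On a query for a recently moved region, the server could then raise the specific exception described above carrying getNewAddress(...), and the client would refresh its cache from it before retrying.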

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5877) When a query fails because the region has moved, let the regionserver return the new address to the client

2012-04-26 Thread Zhihong Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5877:
--

Comment: was deleted

(was: @N:
I think the following validation in a real cluster would illustrate the benefit 
of this feature:
For a given table, select a region server and note the row key ranges hosted by 
the region server. Direct client load to this server.
Kill the server at time T.

Difference in client response to region migration around time T with and 
without the patch would be interesting.)

 When a query fails because the region has moved, let the regionserver return 
 the new address to the client
 --

 Key: HBASE-5877
 URL: https://issues.apache.org/jira/browse/HBASE-5877
 Project: HBase
  Issue Type: Improvement
  Components: client, master, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.96.0

 Attachments: 5877.v1.patch


 This is mainly useful when we do a rolling restart. This will decrease the 
 load on the master and the network load.
 Note that a region is not immediately opened after a close. So:
 - it seems preferable to wait before retrying on the other server. An 
 optimisation would be to have a heuristic depending on when the region was 
 closed.
 - during a rolling restart, the server moves the regions then stops. So we 
 may have failures when the server is stopped, and this patch won't help.
 The implementation in the first patch does:
 - on the region move, there is an added parameter on the regionserver#close 
 to say where we are sending the region
 - the regionserver keeps a list of what was moved. Each entry is kept for 100 
 seconds.
 - the regionserver sends a specific exception when it receives a query on a 
 moved region. This exception contains the new address.
 - the client analyses the exceptions and updates its cache accordingly...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5877) When a query fails because the region has moved, let the regionserver return the new address to the client

2012-04-26 Thread Zhihong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262714#comment-13262714
 ] 

Zhihong Yu commented on HBASE-5877:
---

@N:
I think the following validation in a real cluster would illustrate the benefit 
of this feature:
For a given table, select a region server and note the row key ranges hosted by 
one region on the region server. Direct client load to this region.
Issue the following command in shell: 
{code}
  hbase> move 'ENCODED_REGIONNAME', 'SERVER_NAME'
{code}
at time T.

Difference in client response to region migration around time T with and 
without the patch would be interesting.


 When a query fails because the region has moved, let the regionserver return 
 the new address to the client
 --

 Key: HBASE-5877
 URL: https://issues.apache.org/jira/browse/HBASE-5877
 Project: HBase
  Issue Type: Improvement
  Components: client, master, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.96.0

 Attachments: 5877.v1.patch


 This is mainly useful when we do a rolling restart. This will decrease the 
 load on the master and the network load.
 Note that a region is not immediately opened after a close. So:
 - it seems preferable to wait before retrying on the other server. An 
 optimisation would be to have a heuristic depending on when the region was 
 closed.
 - during a rolling restart, the server moves the regions then stops. So we 
 may have failures when the server is stopped, and this patch won't help.
 The implementation in the first patch does:
 - on the region move, there is an added parameter on the regionserver#close 
 to say where we are sending the region
 - the regionserver keeps a list of what was moved. Each entry is kept for 100 
 seconds.
 - the regionserver sends a specific exception when it receives a query on a 
 moved region. This exception contains the new address.
 - the client analyses the exceptions and updates its cache accordingly...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5877) When a query fails because the region has moved, let the regionserver return the new address to the client

2012-04-26 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262717#comment-13262717
 ] 

nkeywal commented on HBASE-5877:


Note that I'm currently rewriting the patch, as it conflicts with the protobuf 
stuff that was committed recently... But the logic hasn't changed.

@ted What we're saving in the current implementation is a call to the master. 
It can be interesting in itself if the moved region is used by a lot of 
clients. We could do better by letting the client know that the region is now 
fully available somewhere else and that there is no need to wait before 
retrying. But right now the region server only knows that the region is closed 
and moved to another server. It doesn't know if the region is opened yet. We 
could have this by adding the info in zk, but it would increase the zk load...
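
To make the idea concrete, here is a minimal sketch of the "moved region" answer 
described above; the class name and fields are taken from the description, not from 
the actual patch:
{code}
import java.io.IOException;

// Hypothetical exception the closing server could throw at stale clients:
// instead of a plain NotServingRegionException it carries the new location,
// so the client can update its region cache without asking the master.
public class RegionMovedException extends IOException {
  private static final long serialVersionUID = 1L;

  private final String hostname;
  private final int port;

  public RegionMovedException(String hostname, int port) {
    super("Region moved to " + hostname + ":" + port);
    this.hostname = hostname;
    this.port = port;
  }

  public String getHostname() { return hostname; }
  public int getPort() { return port; }
}
{code}
On the client side, catching this exception and replacing the cached location would 
save the round trip to the master that the current retry path needs.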

 When a query fails because the region has moved, let the regionserver return 
 the new address to the client
 --

 Key: HBASE-5877
 URL: https://issues.apache.org/jira/browse/HBASE-5877
 Project: HBase
  Issue Type: Improvement
  Components: client, master, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.96.0

 Attachments: 5877.v1.patch


 This is mainly useful when we do a rolling restart. This will decrease the 
 load on the master and the network load.
 Note that a region is not immediately opened after a close. So:
 - it seems preferable to wait before retrying on the other server. An 
 optimisation would be to have a heuristic depending on when the region was 
 closed.
 - during a rolling restart, the server moves the regions then stops. So we 
 may have failures when the server is stopped, and this patch won't help.
 The implementation in the first patch does:
 - on the region move, there is an added parameter on the regionserver#close 
 to say where we are sending the region
 - the regionserver keeps a list of what was moved. Each entry is kept for 100 
 seconds.
 - the regionserver sends a specific exception when it receives a query on a 
 moved region. This exception contains the new address.
 - the client analyses the exceptions and updates its cache accordingly...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5856) byte - String is not consistent between HBaseAdmin and HRegionInfo

2012-04-26 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262724#comment-13262724
 ] 

stack commented on HBASE-5856:
--

Are you passing the name copied from the UI using double quotes or single 
quotes?  The shell acts differently with double quotes: if double quotes are 
used, it will pass the unescaped region name to hbase.  At least, it used to do 
this.  Thanks Binlijin.

 byte - String is not consistent between HBaseAdmin and HRegionInfo
 

 Key: HBASE-5856
 URL: https://issues.apache.org/jira/browse/HBASE-5856
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.90.6, 0.92.1
Reporter: binlijin
 Attachments: HBASE-5856-0.92.patch


 In HBaseAdmin:
   public void split(final String tableNameOrRegionName)
   throws IOException, InterruptedException {
     split(Bytes.toBytes(tableNameOrRegionName));  // String to byte[]
   }
 In HRegionInfo:
   this.regionNameStr = Bytes.toStringBinary(this.regionName);  // byte[] to String
 Should we use Bytes.toBytesBinary in HBaseAdmin?
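
 For reference, a small standalone example of the asymmetry being reported, using 
 HBase's Bytes utility (the region name bytes below are made up for the example):
 {code}
 import org.apache.hadoop.hbase.util.Bytes;

 public class BinaryNameRoundTrip {
   public static void main(String[] args) {
     // A region name may contain arbitrary bytes, e.g. a 0x00 in the start key.
     byte[] regionName = new byte[] { 't', 'e', 's', 't', ',', 0x00, 0x01, ',' };

     // HRegionInfo (and the UI) render it with toStringBinary: "test,\x00\x01,"
     String shown = Bytes.toStringBinary(regionName);

     // Converting back with plain toBytes() keeps the literal \xNN escapes...
     byte[] viaToBytes = Bytes.toBytes(shown);

     // ...while toBytesBinary() undoes the escaping and restores the original.
     byte[] viaToBytesBinary = Bytes.toBytesBinary(shown);

     System.out.println(Bytes.equals(regionName, viaToBytes));       // false
     System.out.println(Bytes.equals(regionName, viaToBytesBinary)); // true
   }
 }
 {code}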

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5877) When a query fails because the region has moved, let the regionserver return the new address to the client

2012-04-26 Thread Zhihong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262731#comment-13262731
 ] 

Zhihong Yu commented on HBASE-5877:
---

@N:
If the testing result is favorable, I think Lars may want it in 0.94 as well.
I think making this feature functional in a 0.94 cluster would be a good start.

bq. We could have this by adding the info in zk
A separate discussion should be started w.r.t. the above. This would shift the 
load imposed by clients from the master to the zk quorum.

 When a query fails because the region has moved, let the regionserver return 
 the new address to the client
 --

 Key: HBASE-5877
 URL: https://issues.apache.org/jira/browse/HBASE-5877
 Project: HBase
  Issue Type: Improvement
  Components: client, master, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.96.0

 Attachments: 5877.v1.patch


 This is mainly useful when we do a rolling restart. This will decrease the 
 load on the master and the network load.
 Note that a region is not immediately opened after a close. So:
 - it seems preferable to wait before retrying on the other server. An 
 optimisation would be to have a heuristic depending on when the region was 
 closed.
 - during a rolling restart, the server moves the regions then stops. So we 
 may have failures when the server is stopped, and this patch won't help.
 The implementation in the first patch does:
 - on the region move, there is an added parameter on the regionserver#close 
 to say where we are sending the region
 - the regionserver keeps a list of what was moved. Each entry is kept for 100 
 seconds.
 - the regionserver sends a specific exception when it receives a query on a 
 moved region. This exception contains the new address.
 - the client analyses the exceptions and updates its cache accordingly...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5859) Optimize the rolling restart script

2012-04-26 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262732#comment-13262732
 ] 

stack commented on HBASE-5859:
--

I like the idea of regionservers being allocated a range of ports rather than 
an explicit one.  Ditto for its UI.  We'd have to do the tooling to support 
moving ports first -- the port the client talks to and that of the UI (maybe 
one day we just remove the UI from regionservers; instead have a separate UI 
app that is fed via jmx, etc) -- and then once that was done, we could do nice 
tricks like this.

Can you think of any other tricks we could do if the ports regionservers 
answered on were fuzzy?

 Optimize the rolling restart script
 ---

 Key: HBASE-5859
 URL: https://issues.apache.org/jira/browse/HBASE-5859
 Project: HBase
  Issue Type: Improvement
  Components: regionserver, scripts
Affects Versions: 0.96.0
Reporter: nkeywal
Priority: Minor

 There is a graceful_stop script. It uses this algorithm:
 {noformat}
 for i = 0 to servers.size {
  regionsInServer = servers[i].regions
  move servers[i].regions to random
  stop servers[i]
  start servers[i]
  move regionsInServer to servers[i] //filled back with the same regions
 }
 {noformat}
 It would be possible to optimize it while keeping data locality with
 {noformat}
 for i = 0 to servers.size {
  start servers[i*2+1] on the computer of servers[i] // Two RS on the same box
  move servers[i].regions to servers[i*2+1]  // The one on the same box
  stop servers[i]
 }
 {noformat}
 There would be an impact with a fixed port configuration. To fix this, we 
 could:
 - use a range of ports instead of a single port. This could be an issue for 
 the web port.
 - start on a port then reuse the fixed ones when they become available. This 
 is not very elegant if client code is already using the previous port. 
 Moreover the region server address (with its port) is written in the meta table.
 - do a mix of the two solutions: a range for the server itself, while waiting 
 for the web port to be available.
 To be discussed...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5548) Add ability to get a table in the shell

2012-04-26 Thread Jesse Yates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesse Yates updated HBASE-5548:
---

Attachment: ruby_HBASE-5548-v5.patch

Attaching a patch with fixes as per stack. Turns out the get issue was just a 
misspelling I didn't catch; I _think_ everything else is working.

Also, this includes a slight refactor so the help text actually lives in the 
table class, not remotely in table_help.rb. If it was confusing enough that I 
forgot where it lived, then others will too :)

Should be good to go.

 Add ability to get a table in the shell
 ---

 Key: HBASE-5548
 URL: https://issues.apache.org/jira/browse/HBASE-5548
 Project: HBase
  Issue Type: Improvement
  Components: shell
Reporter: Jesse Yates
Assignee: Jesse Yates
 Fix For: 0.96.0, 0.94.1

 Attachments: ruby_HBASE-5528-v0.patch, ruby_HBASE-5548-v1.patch, 
 ruby_HBASE-5548-v2.patch, ruby_HBASE-5548-v3.patch, ruby_HBASE-5548-v5.patch


 Currently, all the commands that operate on a table in the shell first have 
 to take the table name as input. 
 There are two main considerations:
 * It is annoying to have to write the table name every time, when you should 
 just be able to get a reference to a table
 * the current implementation is very wasteful - it creates a new HTable for 
 each call (but reuses the connection since it uses the same configuration)
 We should be able to get a handle to a single HTable and then operate on that.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-2214) Do HBASE-1996 -- setting size to return in scan rather than count of rows -- properly

2012-04-26 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262734#comment-13262734
 ] 

jirapos...@reviews.apache.org commented on HBASE-2214:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4726/#review7268
---


Looks good. It seems you are using an old version of protoc, which is ok. Great 
stuff.

- Jimmy


On 2012-04-26 08:18:40, ferdy wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/4726/
bq.  ---
bq.  
bq.  (Updated 2012-04-26 08:18:40)
bq.  
bq.  
bq.  Review request for hbase and Ted Yu.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  HBASE-2214 per scan max buffersize.
bq.  
bq.  
bq.  This addresses bug HBASE-2214.
bq.  https://issues.apache.org/jira/browse/HBASE-2214
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq./src/main/java/org/apache/hadoop/hbase/client/ClientScanner.java 1330680 
bq./src/main/java/org/apache/hadoop/hbase/client/Scan.java 1330680 
bq./src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java 
1330680 
bq./src/main/java/org/apache/hadoop/hbase/protobuf/RequestConverter.java 
1330680 
bq.
/src/main/java/org/apache/hadoop/hbase/protobuf/generated/AdminProtos.java 
1330680 
bq.
/src/main/java/org/apache/hadoop/hbase/protobuf/generated/ClientProtos.java 
1330680 
bq.
/src/main/java/org/apache/hadoop/hbase/protobuf/generated/HBaseProtos.java 
1330680 
bq./src/main/java/org/apache/hadoop/hbase/protobuf/generated/RPCProtos.java 
1330680 
bq.
/src/main/java/org/apache/hadoop/hbase/protobuf/generated/ZooKeeperProtos.java 
1330680 
bq./src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 1330680 
bq./src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 
1330680 
bq./src/main/java/org/apache/hadoop/hbase/regionserver/RegionScanner.java 
1330680 
bq./src/main/java/org/apache/hadoop/hbase/regionserver/RegionServer.java 
1330680 
bq./src/main/protobuf/Client.proto 1330680 
bq.
/src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorInterface.java
 1330680 
bq.  
bq.  Diff: https://reviews.apache.org/r/4726/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  It works when running this test:
bq.  
bq.  
bq.  new HBaseTestingUtility(conf).startMiniCluster();
bq.  
bq.  HBaseAdmin admin = new HBaseAdmin(conf);
bq.  if (!admin.tableExists("test")) {
bq.    HTableDescriptor tableDesc = new HTableDescriptor("test");
bq.    tableDesc.addFamily(new HColumnDescriptor("fam"));
bq.    admin.createTable(tableDesc);
bq.  }
bq.  
bq.  HTable table = new HTable(conf, "test");
bq.  Put put;
bq.  
bq.  put = new Put(Bytes.toBytes("row1"));
bq.  put.add(Bytes.toBytes("fam"), Bytes.toBytes("qual1"), Bytes.toBytes("val1"));
bq.  table.put(put);
bq.  
bq.  put = new Put(Bytes.toBytes("row2"));
bq.  put.add(Bytes.toBytes("fam"), Bytes.toBytes("qual2"), Bytes.toBytes("val2"));
bq.  table.put(put);
bq.  
bq.  put = new Put(Bytes.toBytes("row3"));
bq.  put.add(Bytes.toBytes("fam"), Bytes.toBytes("qual3"), Bytes.toBytes("val3"));
bq.  table.put(put);
bq.  
bq.  table.flushCommits();
bq.  {
bq.    System.out.println("returns all rows at once because of the caching");
bq.    Scan scan = new Scan();
bq.    scan.setCaching(100);
bq.    ResultScanner scanner = table.getScanner(scan);
bq.    scanner.next(100);
bq.  }
bq.  {
bq.    System.out.println("returns one row at a time because of the maxResultSize");
bq.    Scan scan = new Scan();
bq.    scan.setCaching(100);
bq.    scan.setMaxResultSize(1);
bq.    ResultScanner scanner = table.getScanner(scan);
bq.    scanner.next(100);
bq.  }
bq.  
bq.  
bq.  See output:
bq.  
bq.  returns all rows at once because of the caching
bq.  2012-04-25 22:18:47,494 DEBUG [main] client.ClientScanner(94): Creating 
scanner over test starting at key ''
bq.  2012-04-25 22:18:47,494 DEBUG [main] client.ClientScanner(206): Advancing 
internal scanner to startKey at ''
bq.  2012-04-25 22:18:47,499 DEBUG [main] client.ClientScanner(323): Rows 
returned 3
bq.  2012-04-25 22:18:47,502 DEBUG [main] client.ClientScanner(193): Finished 
with scanning at {NAME => 'test,,1335385126388.ed23a82f3d6ca2eab571918843796259.', 
STARTKEY => '', ENDKEY => '', ENCODED => ed23a82f3d6ca2eab571918843796259,}
bq.  returns one row at a time because of the maxResultSize
bq.  2012-04-25 22:18:47,504 DEBUG [main] client.ClientScanner(94): Creating 
scanner over test starting at key ''
bq.  2012-04-25 22:18:47,505 

[jira] [Reopened] (HBASE-5844) Delete the region servers znode after a regions server crash

2012-04-26 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal reopened HBASE-5844:



There is a regression when the cluster is fully distributed: the start command 
hangs. I'm on it. In the meantime, would it be possible to undo the commit?

Sorry about this.

 Delete the region servers znode after a regions server crash
 

 Key: HBASE-5844
 URL: https://issues.apache.org/jira/browse/HBASE-5844
 Project: HBase
  Issue Type: Improvement
  Components: regionserver, scripts
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Fix For: 0.96.0

 Attachments: 5844.v1.patch, 5844.v2.patch, 5844.v3.patch


 Today, if the region server crashes, its znode is not deleted in ZooKeeper, 
 so the recovery process will start only after a timeout, usually 30s.
 By deleting the znode in the start script, we remove this delay and the 
 recovery starts immediately.
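
 As an illustration only (the patch does this from the start script), deleting the 
 znode boils down to a plain ZooKeeper delete; the quorum and znode path below are 
 example values assuming the default /hbase/rs layout:
 {code}
 import org.apache.zookeeper.WatchedEvent;
 import org.apache.zookeeper.Watcher;
 import org.apache.zookeeper.ZooKeeper;

 // Illustration: remove the znode of a crashed region server so the master
 // notices the death immediately instead of waiting for the ZK session timeout.
 public class DeleteRsZnode {
   public static void main(String[] args) throws Exception {
     String quorum = "zkhost:2181";
     String rsZnode = "/hbase/rs/rshost.example.com,60020,1335385126388";
     ZooKeeper zk = new ZooKeeper(quorum, 30000, new Watcher() {
       public void process(WatchedEvent event) {
         // no-op watcher, we only issue one delete
       }
     });
     try {
       zk.delete(rsZnode, -1); // -1 means: delete whatever version is there
     } finally {
       zk.close();
     }
   }
 }
 {code}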

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5620) Convert the client protocol of HRegionInterface to PB

2012-04-26 Thread Zhihong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262742#comment-13262742
 ] 

Zhihong Yu commented on HBASE-5620:
---

Is there a plan to adopt measures to counter this 8% performance dip?

 Convert the client protocol of HRegionInterface to PB
 -

 Key: HBASE-5620
 URL: https://issues.apache.org/jira/browse/HBASE-5620
 Project: HBase
  Issue Type: Sub-task
  Components: ipc, master, migration, regionserver
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Fix For: 0.96.0

 Attachments: hbase-5620-sec.patch, hbase-5620_v3.patch, 
 hbase-5620_v4.patch, hbase-5620_v4.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5672) TestLruBlockCache#testBackgroundEvictionThread fails occasionally

2012-04-26 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5672:
-

   Resolution: Fixed
Fix Version/s: 0.96.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Thanks for the patch Chunhui.  Committed to trunk.

 TestLruBlockCache#testBackgroundEvictionThread fails occasionally
 -

 Key: HBASE-5672
 URL: https://issues.apache.org/jira/browse/HBASE-5672
 Project: HBase
  Issue Type: Bug
Reporter: chunhui shen
Assignee: chunhui shen
 Fix For: 0.96.0

 Attachments: HBASE-5672.patch, HBASE-5672v2.patch, HBASE-5672v3.patch


 We find that TestLruBlockCache#testBackgroundEvictionThread fails occasionally.
 I think it's a problem with the test case, because runEviction() only does 
 evictionThread.evict():
 {code}
 public void evict() {
   synchronized(this) {
     this.notify(); // FindBugs NN_NAKED_NOTIFY
   }
 }
 {code}
 However, when we call evictionThread.evict(), the evictionThread may not have 
 entered run() yet in TestLruBlockCache#testBackgroundEvictionThread.
 If we run the test many times, we can reproduce the failure easily.
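
 To illustrate the race (this is a generic sketch, not the committed fix): if 
 notify() runs before the eviction thread has reached wait(), the wakeup is lost. 
 One way to close the window is to let the thread publish that it is ready while 
 still holding the lock:
 {code}
 import java.util.concurrent.CountDownLatch;

 // Generic sketch of the lost-notify race and one way to avoid it; this is
 // not the actual TestLruBlockCache/LruBlockCache code.
 class EvictionThread extends Thread {
   private final CountDownLatch enteredRun = new CountDownLatch(1);
   private volatile boolean go = true;

   @Override
   public void run() {
     synchronized (this) {
       enteredRun.countDown(); // lock is held, so evict() cannot notify yet
       while (go) {
         try {
           wait();             // the wakeup comes from evict()
         } catch (InterruptedException e) {
           Thread.currentThread().interrupt();
           return;
         }
         // ... perform one eviction pass here ...
       }
     }
   }

   void evict() throws InterruptedException {
     enteredRun.await();       // don't notify before run() is really waiting
     synchronized (this) {
       notify();
     }
   }

   void shutdown() {
     go = false;
     synchronized (this) {
       notify();
     }
   }
 }
 {code}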

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5862) After Region Close remove the Operation Metrics.

2012-04-26 Thread Zhihong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262748#comment-13262748
 ] 

Zhihong Yu commented on HBASE-5862:
---

@Stack:
Do you have suggestions on further improvement for the latest patch ?

 After Region Close remove the Operation Metrics.
 

 Key: HBASE-5862
 URL: https://issues.apache.org/jira/browse/HBASE-5862
 Project: HBase
  Issue Type: Improvement
Reporter: Elliott Clark
Assignee: Elliott Clark
Priority: Minor
 Fix For: 0.96.0

 Attachments: HBASE-5862-0.patch, HBASE-5862-1.patch, 
 HBASE-5862-2.patch, HBASE-5862-3.patch, HBASE-5862-94-3.patch


 If a region is closed then Hadoop metrics shouldn't still be reporting about 
 that region.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5862) After Region Close remove the Operation Metrics.

2012-04-26 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262754#comment-13262754
 ] 

stack commented on HBASE-5862:
--

+1 on patch.  Please add more comments around why you are hacking into hadoop 
metrics.  Please also paste some pretty pictures so Lars can see why this has 
to be in 0.94.

 After Region Close remove the Operation Metrics.
 

 Key: HBASE-5862
 URL: https://issues.apache.org/jira/browse/HBASE-5862
 Project: HBase
  Issue Type: Improvement
Reporter: Elliott Clark
Assignee: Elliott Clark
Priority: Minor
 Fix For: 0.96.0

 Attachments: HBASE-5862-0.patch, HBASE-5862-1.patch, 
 HBASE-5862-2.patch, HBASE-5862-3.patch, HBASE-5862-94-3.patch


 If a region is closed then Hadoop metrics shouldn't still be reporting about 
 that region.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5862) After Region Close remove the Operation Metrics.

2012-04-26 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5862:
-

 Priority: Critical  (was: Minor)
Fix Version/s: 0.94.0

 After Region Close remove the Operation Metrics.
 

 Key: HBASE-5862
 URL: https://issues.apache.org/jira/browse/HBASE-5862
 Project: HBase
  Issue Type: Improvement
Reporter: Elliott Clark
Assignee: Elliott Clark
Priority: Critical
 Fix For: 0.94.0, 0.96.0

 Attachments: HBASE-5862-0.patch, HBASE-5862-1.patch, 
 HBASE-5862-2.patch, HBASE-5862-3.patch, HBASE-5862-94-3.patch


 If a region is closed then Hadoop metrics shouldn't still be reporting about 
 that region.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HBASE-5884) MapReduce package info has broken link to bulk-loads

2012-04-26 Thread Jesse Yates (JIRA)
Jesse Yates created HBASE-5884:
--

 Summary: MapReduce package info has broken link to bulk-loads
 Key: HBASE-5884
 URL: https://issues.apache.org/jira/browse/HBASE-5884
 Project: HBase
  Issue Type: Bug
  Components: mapreduce
Affects Versions: 0.96.0
Reporter: Jesse Yates
Assignee: Jesse Yates
Priority: Trivial
 Fix For: 0.96.0, 0.94.1


Bulk Loads link goes to an old link, which we have dropped recently.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5884) MapReduce package info has broken link to bulk-loads

2012-04-26 Thread Jesse Yates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesse Yates updated HBASE-5884:
---

Attachment: doc_HBASE-5884.patch

Attaching one-liner fix.

 MapReduce package info has broken link to bulk-loads
 

 Key: HBASE-5884
 URL: https://issues.apache.org/jira/browse/HBASE-5884
 Project: HBase
  Issue Type: Bug
  Components: mapreduce
Affects Versions: 0.96.0
Reporter: Jesse Yates
Assignee: Jesse Yates
Priority: Trivial
 Fix For: 0.96.0, 0.94.1

 Attachments: doc_HBASE-5884.patch


 Bulk Loads link goes to an old link, which we have dropped recently.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5862) After Region Close remove the Operation Metrics.

2012-04-26 Thread Elliott Clark (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliott Clark updated HBASE-5862:
-

Attachment: TSD.png

Stack wanted a screen shot of what all the recent region metrics work has 
enabled.

Here's a shot of TSDB showing average time of a multi put over all the regions 
of a table.  It illustrates regions being split and new regions showing up.

stack had some comments about places that needed some extra info; I'll get those 
trivial patches up in a sec.

 After Region Close remove the Operation Metrics.
 

 Key: HBASE-5862
 URL: https://issues.apache.org/jira/browse/HBASE-5862
 Project: HBase
  Issue Type: Improvement
Reporter: Elliott Clark
Assignee: Elliott Clark
Priority: Critical
 Fix For: 0.94.0, 0.96.0

 Attachments: HBASE-5862-0.patch, HBASE-5862-1.patch, 
 HBASE-5862-2.patch, HBASE-5862-3.patch, HBASE-5862-94-3.patch, TSD.png


 If a region is closed then Hadoop metrics shouldn't still be reporting about 
 that region.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3996) Support multiple tables and scanners as input to the mapper in map/reduce jobs

2012-04-26 Thread Bohdan Mushkevych (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262770#comment-13262770
 ] 

Bohdan Mushkevych commented on HBASE-3996:
--

@stack, @Ted Yu, @Lars Hofhansl
Gentlemen, it seems that the last changes [1] were submitted 4 weeks ago.
My personal fear is that the ticket will get outdated due to trunk changes 
and will miss the 0.94-0.96 target.

[1] Diff r7 
https://reviews.apache.org/r/4411/diff/7/

 Support multiple tables and scanners as input to the mapper in map/reduce jobs
 --

 Key: HBASE-3996
 URL: https://issues.apache.org/jira/browse/HBASE-3996
 Project: HBase
  Issue Type: Improvement
  Components: mapreduce
Reporter: Eran Kutner
Assignee: Eran Kutner
 Fix For: 0.96.0

 Attachments: 3996-v2.txt, 3996-v3.txt, 3996-v4.txt, 3996-v5.txt, 
 3996-v6.txt, 3996-v7.txt, HBase-3996.patch


 It seems that in many cases feeding data from multiple tables or multiple 
 scanners on a single table can save a lot of time when running map/reduce 
 jobs.
 I propose a new MultiTableInputFormat class that would allow doing this.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5844) Delete the region servers znode after a regions server crash

2012-04-26 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262773#comment-13262773
 ] 

stack commented on HBASE-5844:
--

I reverted the patch from trunk.

 Delete the region servers znode after a regions server crash
 

 Key: HBASE-5844
 URL: https://issues.apache.org/jira/browse/HBASE-5844
 Project: HBase
  Issue Type: Improvement
  Components: regionserver, scripts
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
 Fix For: 0.96.0

 Attachments: 5844.v1.patch, 5844.v2.patch, 5844.v3.patch


 Today, if the region server crashes, its znode is not deleted in ZooKeeper, 
 so the recovery process will start only after a timeout, usually 30s.
 By deleting the znode in the start script, we remove this delay and the 
 recovery starts immediately.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5862) After Region Close remove the Operation Metrics.

2012-04-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262778#comment-13262778
 ] 

Hadoop QA commented on HBASE-5862:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12524468/TSD.png
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1657//console

This message is automatically generated.

 After Region Close remove the Operation Metrics.
 

 Key: HBASE-5862
 URL: https://issues.apache.org/jira/browse/HBASE-5862
 Project: HBase
  Issue Type: Improvement
Reporter: Elliott Clark
Assignee: Elliott Clark
Priority: Critical
 Fix For: 0.94.0, 0.96.0

 Attachments: HBASE-5862-0.patch, HBASE-5862-1.patch, 
 HBASE-5862-2.patch, HBASE-5862-3.patch, HBASE-5862-94-3.patch, TSD.png


 If a region is closed then Hadoop metrics shouldn't still be reporting about 
 that region.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5862) After Region Close remove the Operation Metrics.

2012-04-26 Thread Zhihong Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5862:
--

Comment: was deleted

(was: -1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12524468/TSD.png
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1657//console

This message is automatically generated.)

 After Region Close remove the Operation Metrics.
 

 Key: HBASE-5862
 URL: https://issues.apache.org/jira/browse/HBASE-5862
 Project: HBase
  Issue Type: Improvement
Reporter: Elliott Clark
Assignee: Elliott Clark
Priority: Critical
 Fix For: 0.94.0, 0.96.0

 Attachments: HBASE-5862-0.patch, HBASE-5862-1.patch, 
 HBASE-5862-2.patch, HBASE-5862-3.patch, HBASE-5862-94-3.patch, TSD.png


 If a region is closed then Hadoop metrics shouldn't still be reporting about 
 that region.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5881) BuiltIn Gzip compressor decompressor not getting pooled, leading to native memory leak

2012-04-26 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-5881:
--

Fix Version/s: (was: 0.94.0)
   0.94.1

 BuiltIn Gzip compressor & decompressor not getting pooled, leading to native 
 memory leak
 

 Key: HBASE-5881
 URL: https://issues.apache.org/jira/browse/HBASE-5881
 Project: HBase
  Issue Type: Bug
  Components: io
Affects Versions: 0.92.1
Reporter: Gopinathan A
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Fix For: 0.92.2, 0.96.0, 0.94.1


 This issue will occur only in hadoop 0.23.x & above.
 In hadoop 0.20.x
 {code}
   public static void returnDecompressor(Decompressor decompressor) {
     if (decompressor == null) {
       return;
     }
     decompressor.reset();
     payback(decompressorPool, decompressor);
   }
 {code}
 In hadoop 0.23.x
 {code}
   public static void returnDecompressor(Decompressor decompressor) {
     if (decompressor == null) {
       return;
     }
     // if the decompressor can't be reused, don't pool it.
     if (decompressor.getClass().isAnnotationPresent(DoNotPool.class)) {
       return;
     }
     decompressor.reset();
     payback(decompressorPool, decompressor);
   }
 {code}
 Here an annotation has been added. By default this library will be loaded if 
 there is no native library.
 {code}
 @DoNotPool
 public class BuiltInGzipDecompressor
 {code}
 Due to this, a new compressor/decompressor will be created each time, which 
 leads to a native memory leak.
 {noformat}
 2012-04-25 22:11:48,093 INFO org.apache.hadoop.io.compress.CodecPool: Got 
 brand-new decompressor [.gz]
 2012-04-25 22:11:48,093 INFO org.apache.hadoop.io.compress.CodecPool: Got 
 brand-new decompressor [.gz]
 2012-04-25 22:11:48,093 INFO org.apache.hadoop.io.compress.CodecPool: Got 
 brand-new decompressor [.gz]
 {noformat}
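
 A small way to see the effect (assuming no native zlib is available, so GzipCodec 
 falls back to the built-in Java implementation): on 0.23 the pool refuses @DoNotPool 
 decompressors, so every iteration below ends up with a brand-new decompressor, which 
 is the log pattern shown above. This is a demo sketch, not code from the patch.
 {code}
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.io.compress.CodecPool;
 import org.apache.hadoop.io.compress.Decompressor;
 import org.apache.hadoop.io.compress.GzipCodec;
 import org.apache.hadoop.util.ReflectionUtils;

 // With no native zlib, each getDecompressor() call on 0.23 creates a new
 // BuiltInGzipDecompressor because returnDecompressor() will not pool it.
 public class GzipPoolingDemo {
   public static void main(String[] args) {
     Configuration conf = new Configuration();
     GzipCodec codec = ReflectionUtils.newInstance(GzipCodec.class, conf);
     for (int i = 0; i < 3; i++) {
       Decompressor decompressor = CodecPool.getDecompressor(codec);
       // ... a read path would use the decompressor here ...
       CodecPool.returnDecompressor(decompressor); // no-op for @DoNotPool
     }
   }
 }
 {code}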

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5872) Improve hadoopqa script to include checks for hadoop 0.23 build

2012-04-26 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262787#comment-13262787
 ] 

stack commented on HBASE-5872:
--

+1 on option #1.  +1 on the patch.  +1 on what Ted says above that it'd be best 
if the compile against 0.23 succeeded before committing this.

 Improve hadoopqa script to include checks for hadoop 0.23 build
 ---

 Key: HBASE-5872
 URL: https://issues.apache.org/jira/browse/HBASE-5872
 Project: HBase
  Issue Type: New Feature
Affects Versions: 0.96.0
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
 Attachments: hbase-5872.patch


 There have been a few patches that have made it into hbase trunk that have 
 broken the compile of hbase against hadoop 0.23.x without it being known for a 
 few days.
 We could have the bot do a few things:
 1) verify that the patch compiles against hadoop 0.23
 2) verify that unit tests pass against hadoop 0.23

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5881) BuiltIn Gzip compressor decompressor not getting pooled, leading to native memory leak

2012-04-26 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262790#comment-13262790
 ] 

stack commented on HBASE-5881:
--

Is the existence of this JIRA sufficient documentation until the above is fixed?

 BuiltIn Gzip compressor & decompressor not getting pooled, leading to native 
 memory leak
 

 Key: HBASE-5881
 URL: https://issues.apache.org/jira/browse/HBASE-5881
 Project: HBase
  Issue Type: Bug
  Components: io
Affects Versions: 0.92.1
Reporter: Gopinathan A
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Fix For: 0.92.2, 0.96.0, 0.94.1


 This issue will occur only in hadoop 0.23.x & above.
 In hadoop 0.20.x
 {code}
   public static void returnDecompressor(Decompressor decompressor) {
     if (decompressor == null) {
       return;
     }
     decompressor.reset();
     payback(decompressorPool, decompressor);
   }
 {code}
 In hadoop 0.23.x
 {code}
   public static void returnDecompressor(Decompressor decompressor) {
     if (decompressor == null) {
       return;
     }
     // if the decompressor can't be reused, don't pool it.
     if (decompressor.getClass().isAnnotationPresent(DoNotPool.class)) {
       return;
     }
     decompressor.reset();
     payback(decompressorPool, decompressor);
   }
 {code}
 Here an annotation has been added. By default this library will be loaded if 
 there is no native library.
 {code}
 @DoNotPool
 public class BuiltInGzipDecompressor
 {code}
 Due to this, a new compressor/decompressor will be created each time, which 
 leads to a native memory leak.
 {noformat}
 2012-04-25 22:11:48,093 INFO org.apache.hadoop.io.compress.CodecPool: Got 
 brand-new decompressor [.gz]
 2012-04-25 22:11:48,093 INFO org.apache.hadoop.io.compress.CodecPool: Got 
 brand-new decompressor [.gz]
 2012-04-25 22:11:48,093 INFO org.apache.hadoop.io.compress.CodecPool: Got 
 brand-new decompressor [.gz]
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5672) TestLruBlockCache#testBackgroundEvictionThread fails occasionally

2012-04-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262789#comment-13262789
 ] 

Hudson commented on HBASE-5672:
---

Integrated in HBase-TRUNK #2817 (See 
[https://builds.apache.org/job/HBase-TRUNK/2817/])
HBASE-5672 TestLruBlockCache#testBackgroundEvictionThread fails 
occasionally (Revision 1330971)

 Result = SUCCESS
stack : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/io/hfile/TestLruBlockCache.java


 TestLruBlockCache#testBackgroundEvictionThread fails occasionally
 -

 Key: HBASE-5672
 URL: https://issues.apache.org/jira/browse/HBASE-5672
 Project: HBase
  Issue Type: Bug
Reporter: chunhui shen
Assignee: chunhui shen
 Fix For: 0.96.0

 Attachments: HBASE-5672.patch, HBASE-5672v2.patch, HBASE-5672v3.patch


 We find that TestLruBlockCache#testBackgroundEvictionThread fails occasionally.
 I think it's a problem with the test case, because runEviction() only does 
 evictionThread.evict():
 {code}
 public void evict() {
   synchronized(this) {
     this.notify(); // FindBugs NN_NAKED_NOTIFY
   }
 }
 {code}
 However, when we call evictionThread.evict(), the evictionThread may not have 
 entered run() yet in TestLruBlockCache#testBackgroundEvictionThread.
 If we run the test many times, we can reproduce the failure easily.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5161) Compaction algorithm should prioritize reference files

2012-04-26 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-5161:
-

Fix Version/s: (was: 0.94.0)
   0.94.1

 Compaction algorithm should prioritize reference files
 --

 Key: HBASE-5161
 URL: https://issues.apache.org/jira/browse/HBASE-5161
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Jean-Daniel Cryans
Priority: Critical
 Fix For: 0.92.1, 0.94.1


 I got myself into a state where my table was un-splittable as long as the 
 insert load was coming in. Emergency flushes because of the low memory 
 barrier don't check the number of store files, so they never block, to the 
 point where in one case I had 45 store files and the compactions were almost 
 never done on the reference files (I had 15 of them; that went down by one in 
 20 minutes). Since you can't split regions with reference files, that region 
 couldn't split and was doomed to just get more store files until the load 
 stopped.
 Marking this as a minor issue; what we really need is a better pushback 
 mechanism, but not prioritizing reference files seems wrong.
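
 For illustration, a toy version of the suggested prioritization (stand-in types, 
 not the HBase compaction code): pick reference files first so the region becomes 
 splittable again, then fill up with regular files.
 {code}
 import java.util.ArrayList;
 import java.util.List;

 // Toy selection policy: reference files (the half-files left by a split)
 // are chosen before regular store files. StoreFileInfo is a stand-in type.
 public class ReferenceFirstSelection {
   public static class StoreFileInfo {
     final String name;
     final boolean reference;
     public StoreFileInfo(String name, boolean reference) {
       this.name = name;
       this.reference = reference;
     }
   }

   public static List<StoreFileInfo> select(List<StoreFileInfo> candidates, int maxFiles) {
     List<StoreFileInfo> picked = new ArrayList<StoreFileInfo>();
     for (StoreFileInfo f : candidates) {          // first pass: reference files
       if (f.reference && picked.size() < maxFiles) {
         picked.add(f);
       }
     }
     for (StoreFileInfo f : candidates) {          // second pass: everything else
       if (!f.reference && picked.size() < maxFiles) {
         picked.add(f);
       }
     }
     return picked;
   }
 }
 {code}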

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5829) Inconsistency between the regions map and the servers map in AssignmentManager

2012-04-26 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262793#comment-13262793
 ] 

stack commented on HBASE-5829:
--

@Ted Make a new issue?

 Inconsistency between the regions map and the servers map in 
 AssignmentManager
 --

 Key: HBASE-5829
 URL: https://issues.apache.org/jira/browse/HBASE-5829
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.6, 0.92.1
Reporter: Maryann Xue
 Attachments: HBASE-5829-0.90.patch, HBASE-5829-trunk.patch


 There are occurrences in AM where this.servers is not kept consistent with 
 this.regions. This might cause the balancer to offline a region from the RS 
 that already returned NotServingRegionException at a previous offline attempt.
 In AssignmentManager.unassign(HRegionInfo, boolean):
   try {
     // TODO: We should consider making this look more like it does for the
     // region open where we catch all throwables and never abort
     if (serverManager.sendRegionClose(server, state.getRegion(),
         versionOfClosingNode)) {
       LOG.debug("Sent CLOSE to " + server + " for region " +
         region.getRegionNameAsString());
       return;
     }
     // This never happens. Currently regionserver close always return true.
     LOG.warn("Server " + server + " region CLOSE RPC returned false for " +
       region.getRegionNameAsString());
   } catch (NotServingRegionException nsre) {
     LOG.info("Server " + server + " returned " + nsre + " for " +
       region.getRegionNameAsString());
     // Presume that master has stale data.  Presume remote side just split.
     // Presume that the split message when it comes in will fix up the master's
     // in memory cluster state.
   } catch (Throwable t) {
     if (t instanceof RemoteException) {
       t = ((RemoteException)t).unwrapRemoteException();
       if (t instanceof NotServingRegionException) {
         if (checkIfRegionBelongsToDisabling(region)) {
           // Remove from the regionsinTransition map
           LOG.info("While trying to recover the table "
               + region.getTableNameAsString()
               + " to DISABLED state the region " + region
               + " was offlined but the table was in DISABLING state");
           synchronized (this.regionsInTransition) {
             this.regionsInTransition.remove(region.getEncodedName());
           }
           // Remove from the regionsMap
           synchronized (this.regions) {
             this.regions.remove(region);
           }
           deleteClosingOrClosedNode(region);
         }
       }
       // RS is already processing this region, only need to update the timestamp
       if (t instanceof RegionAlreadyInTransitionException) {
         LOG.debug("update " + state + " the timestamp.");
         state.update(state.getState());
       }
     }
 In AssignmentManager.assign(HRegionInfo, RegionState, boolean, boolean, boolean):
   synchronized (this.regions) {
     this.regions.put(plan.getRegionInfo(), plan.getDestination());
   }
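
 A generic sketch of the bookkeeping the report is asking for (field names and 
 types are simplified; this is not the committed patch): whenever the 
 region-to-server map changes, the server-to-regions map is updated under the 
 same lock, so the balancer never sees a stale assignment.
 {code}
 import java.util.HashMap;
 import java.util.HashSet;
 import java.util.Map;
 import java.util.Set;

 // Simplified stand-in for the two AssignmentManager maps: regions maps a
 // region to its hosting server, servers maps a server to the regions it hosts.
 public class AssignmentBookkeeping {
   private final Map<String, String> regions = new HashMap<String, String>();
   private final Map<String, Set<String>> servers = new HashMap<String, Set<String>>();

   public synchronized void setAssignment(String region, String server) {
     String previous = regions.put(region, server);
     if (previous != null && servers.get(previous) != null) {
       servers.get(previous).remove(region);      // drop the stale entry
     }
     Set<String> hosted = servers.get(server);
     if (hosted == null) {
       hosted = new HashSet<String>();
       servers.put(server, hosted);
     }
     hosted.add(region);
   }

   public synchronized void removeAssignment(String region) {
     String host = regions.remove(region);
     if (host != null && servers.get(host) != null) {
       servers.get(host).remove(region);          // keep both maps in sync
     }
   }
 }
 {code}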

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5862) After Region Close remove the Operation Metrics.

2012-04-26 Thread Elliott Clark (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliott Clark updated HBASE-5862:
-

Attachment: HBASE-5862-4.patch

Patch with more comments.
Also added a return if reflection was unsuccessful, as there is no need to try 
further.

 After Region Close remove the Operation Metrics.
 

 Key: HBASE-5862
 URL: https://issues.apache.org/jira/browse/HBASE-5862
 Project: HBase
  Issue Type: Improvement
Reporter: Elliott Clark
Assignee: Elliott Clark
Priority: Critical
 Fix For: 0.94.0, 0.96.0

 Attachments: HBASE-5862-0.patch, HBASE-5862-1.patch, 
 HBASE-5862-2.patch, HBASE-5862-3.patch, HBASE-5862-4.patch, 
 HBASE-5862-94-3.patch, TSD.png


 If a region is closed then Hadoop metrics shouldn't still be reporting about 
 that region.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5829) Inconsistency between the regions map and the servers map in AssignmentManager

2012-04-26 Thread Zhihong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262798#comment-13262798
 ] 

Zhihong Yu commented on HBASE-5829:
---

The latest patch is good to go.
The useless statement can be addressed elsewhere.

 Inconsistency between the regions map and the servers map in 
 AssignmentManager
 --

 Key: HBASE-5829
 URL: https://issues.apache.org/jira/browse/HBASE-5829
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.6, 0.92.1
Reporter: Maryann Xue
 Attachments: HBASE-5829-0.90.patch, HBASE-5829-trunk.patch


 There are occurrences in AM where this.servers is not kept consistent with 
 this.regions. This might cause the balancer to offline a region from the RS 
 that already returned NotServingRegionException at a previous offline attempt.
 In AssignmentManager.unassign(HRegionInfo, boolean):
   try {
     // TODO: We should consider making this look more like it does for the
     // region open where we catch all throwables and never abort
     if (serverManager.sendRegionClose(server, state.getRegion(),
         versionOfClosingNode)) {
       LOG.debug("Sent CLOSE to " + server + " for region " +
         region.getRegionNameAsString());
       return;
     }
     // This never happens. Currently regionserver close always return true.
     LOG.warn("Server " + server + " region CLOSE RPC returned false for " +
       region.getRegionNameAsString());
   } catch (NotServingRegionException nsre) {
     LOG.info("Server " + server + " returned " + nsre + " for " +
       region.getRegionNameAsString());
     // Presume that master has stale data.  Presume remote side just split.
     // Presume that the split message when it comes in will fix up the master's
     // in memory cluster state.
   } catch (Throwable t) {
     if (t instanceof RemoteException) {
       t = ((RemoteException)t).unwrapRemoteException();
       if (t instanceof NotServingRegionException) {
         if (checkIfRegionBelongsToDisabling(region)) {
           // Remove from the regionsinTransition map
           LOG.info("While trying to recover the table "
               + region.getTableNameAsString()
               + " to DISABLED state the region " + region
               + " was offlined but the table was in DISABLING state");
           synchronized (this.regionsInTransition) {
             this.regionsInTransition.remove(region.getEncodedName());
           }
           // Remove from the regionsMap
           synchronized (this.regions) {
             this.regions.remove(region);
           }
           deleteClosingOrClosedNode(region);
         }
       }
       // RS is already processing this region, only need to update the timestamp
       if (t instanceof RegionAlreadyInTransitionException) {
         LOG.debug("update " + state + " the timestamp.");
         state.update(state.getState());
       }
     }
 In AssignmentManager.assign(HRegionInfo, RegionState, boolean, boolean, boolean):
   synchronized (this.regions) {
     this.regions.put(plan.getRegionInfo(), plan.getDestination());
   }

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5829) Inconsistency between the regions map and the servers map in AssignmentManager

2012-04-26 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5829:
-

   Resolution: Fixed
Fix Version/s: 0.96.0
 Assignee: Maryann Xue
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Applied to trunk.  Letting patch hang out in case someone wants to apply it to 
other branches.

I added you as a contributor Maryann and assigned you this issue (You can 
assign yourself issues going forward).  Thanks for the patch.

 Inconsistency between the regions map and the servers map in 
 AssignmentManager
 --

 Key: HBASE-5829
 URL: https://issues.apache.org/jira/browse/HBASE-5829
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.6, 0.92.1
Reporter: Maryann Xue
Assignee: Maryann Xue
 Fix For: 0.96.0

 Attachments: HBASE-5829-0.90.patch, HBASE-5829-trunk.patch


 There are occurrences in AM where this.servers is not kept consistent with 
 this.regions. This might cause balancer to offline a region from the RS that 
 already returned NotServingRegionException at a previous offline attempt.
 In AssignmentManager.unassign(HRegionInfo, boolean)
 try {
   // TODO: We should consider making this look more like it does for the
   // region open where we catch all throwables and never abort
   if (serverManager.sendRegionClose(server, state.getRegion(),
 versionOfClosingNode)) {
 LOG.debug("Sent CLOSE to " + server + " for region " +
   region.getRegionNameAsString());
 return;
   }
   // This never happens. Currently regionserver close always return true.
   LOG.warn("Server " + server + " region CLOSE RPC returned false for " +
 region.getRegionNameAsString());
 } catch (NotServingRegionException nsre) {
   LOG.info("Server " + server + " returned " + nsre + " for " +
 region.getRegionNameAsString());
   // Presume that master has stale data.  Presume remote side just split.
   // Presume that the split message when it comes in will fix up the master's
   // in memory cluster state.
 } catch (Throwable t) {
   if (t instanceof RemoteException) {
 t = ((RemoteException)t).unwrapRemoteException();
 if (t instanceof NotServingRegionException) {
   if (checkIfRegionBelongsToDisabling(region)) {
 // Remove from the regionsinTransition map
 LOG.info("While trying to recover the table "
 + region.getTableNameAsString()
 + " to DISABLED state the region " + region
 + " was offlined but the table was in DISABLING state");
 synchronized (this.regionsInTransition) {
   this.regionsInTransition.remove(region.getEncodedName());
 }
 // Remove from the regionsMap
 synchronized (this.regions) {
   this.regions.remove(region);
 }
 deleteClosingOrClosedNode(region);
   }
 }
 // RS is already processing this region, only need to update the timestamp
 if (t instanceof RegionAlreadyInTransitionException) {
   LOG.debug("update " + state + " the timestamp.");
   state.update(state.getState());
 }
   }
 In AssignmentManager.assign(HRegionInfo, RegionState, boolean, boolean, boolean)
   synchronized (this.regions) {
 this.regions.put(plan.getRegionInfo(), plan.getDestination());
   }
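
 For readers skimming the thread, a minimal, self-contained sketch of the
 consistency the description asks for -- update this.regions and this.servers
 under the same lock -- using simplified String keys instead of
 HRegionInfo/ServerName; this is only an illustration, not the committed patch:

 {code}
 import java.util.ArrayList;
 import java.util.HashMap;
 import java.util.List;
 import java.util.Map;

 // Simplified stand-in for the AssignmentManager bookkeeping discussed above:
 // regions maps region -> hosting server, servers maps server -> hosted regions.
 // The only point illustrated is that both maps change under one lock.
 public class RegionBookkeeping {
   private final Map<String, String> regions = new HashMap<String, String>();
   private final Map<String, List<String>> servers = new HashMap<String, List<String>>();

   public synchronized void assign(String region, String server) {
     regions.put(region, server);
     List<String> hosted = servers.get(server);
     if (hosted == null) {
       hosted = new ArrayList<String>();
       servers.put(server, hosted);
     }
     hosted.add(region);
   }

   public synchronized void remove(String region) {
     String server = regions.remove(region);
     if (server != null) {
       List<String> hosted = servers.get(server);
       if (hosted != null) {
         hosted.remove(region);  // keep servers consistent with regions
       }
     }
   }
 }
 {code}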

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5862) After Region Close remove the Operation Metrics.

2012-04-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262802#comment-13262802
 ] 

Hadoop QA commented on HBASE-5862:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12524470/HBASE-5862-4.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1658//console

This message is automatically generated.

 After Region Close remove the Operation Metrics.
 

 Key: HBASE-5862
 URL: https://issues.apache.org/jira/browse/HBASE-5862
 Project: HBase
  Issue Type: Improvement
Reporter: Elliott Clark
Assignee: Elliott Clark
Priority: Critical
 Fix For: 0.94.0, 0.96.0

 Attachments: HBASE-5862-0.patch, HBASE-5862-1.patch, 
 HBASE-5862-2.patch, HBASE-5862-3.patch, HBASE-5862-4.patch, 
 HBASE-5862-94-3.patch, TSD.png


 If a region is closed then Hadoop metrics shouldn't still be reporting about 
 that region.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5862) After Region Close remove the Operation Metrics.

2012-04-26 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5862:
-

  Resolution: Fixed
Release Note: Fixes the case where when a region closed, its metrics did 
not; they stayed up associated w/ old hosting server even though region may 
have moved elsewhere.
  Status: Resolved  (was: Patch Available)

Committed to trunk and 0.94.  Thanks for the patch, Elliott.

 After Region Close remove the Operation Metrics.
 

 Key: HBASE-5862
 URL: https://issues.apache.org/jira/browse/HBASE-5862
 Project: HBase
  Issue Type: Improvement
Reporter: Elliott Clark
Assignee: Elliott Clark
Priority: Critical
 Fix For: 0.94.0, 0.96.0

 Attachments: HBASE-5862-0.patch, HBASE-5862-1.patch, 
 HBASE-5862-2.patch, HBASE-5862-3.patch, HBASE-5862-4.patch, 
 HBASE-5862-94-3.patch, TSD.png


 If a region is closed then Hadoop metrics shouldn't still be reporting about 
 that region.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5801) [hbck] Hbck should handle case where some regions have different HTD settings in .regioninfo files (0.90 specific)

2012-04-26 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262807#comment-13262807
 ] 

Jimmy Xiang commented on HBASE-5801:


I looked into it and found out why deleteTable fails: one of the regions is not 
closed.
I will put up another patch soon.

 [hbck] Hbck should handle case where some regions have different HTD settings 
 in .regioninfo files  (0.90 specific)
 ---

 Key: HBASE-5801
 URL: https://issues.apache.org/jira/browse/HBASE-5801
 Project: HBase
  Issue Type: Improvement
  Components: hbck
Affects Versions: 0.90.7
Reporter: Jonathan Hsieh
Assignee: Jimmy Xiang
 Fix For: 0.90.7

 Attachments: hbase_5801_v2.patch


 Recently, we encountered a case where some regions in a table have different 
 HTableDescriptor settings serialized into their HRegionInfo .regioninfo files 
 in HDFS.  hbck expects all HTDs within a table to be the same and currently 
 bails out in this situation.
 We need to either point out a proper set of actions for the user to execute 
 or automatically convert the region to a common HTD (likely the most common 
 one, or possibly the first one).
 Not sure if this requires reformatting data, but it may require closing and 
 restarting a region.
 This issue is hbase 0.90.x specific -- 0.92+ keeps all table info in a single 
 .tableinfo file.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5864) Error while reading from hfile in 0.94

2012-04-26 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262812#comment-13262812
 ] 

Lars Hofhansl commented on HBASE-5864:
--

Going to commit in a few unless there're objections.

 Error while reading from hfile in 0.94
 --

 Key: HBASE-5864
 URL: https://issues.apache.org/jira/browse/HBASE-5864
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.94.0
Reporter: Gopinathan A
Assignee: ramkrishna.s.vasudevan
Priority: Blocker
 Fix For: 0.94.0

 Attachments: HBASE-5864_1.patch, HBASE-5864_2.patch, 
 HBASE-5864_3.patch, HBASE-5864_test.patch


 Got the following stacktrace during region split.
 {noformat}
 2012-04-24 16:05:42,168 WARN org.apache.hadoop.hbase.regionserver.Store: 
 Failed getting store size for value
 java.io.IOException: Requested block is out of range: 2906737606134037404, 
 lastDataBlockOffset: 84764558
   at 
 org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:278)
   at 
 org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.midkey(HFileBlockIndex.java:285)
   at 
 org.apache.hadoop.hbase.io.hfile.HFileReaderV2.midkey(HFileReaderV2.java:402)
   at 
 org.apache.hadoop.hbase.regionserver.StoreFile$Reader.midkey(StoreFile.java:1638)
   at 
 org.apache.hadoop.hbase.regionserver.Store.getSplitPoint(Store.java:1943)
   at 
 org.apache.hadoop.hbase.regionserver.RegionSplitPolicy.getSplitPoint(RegionSplitPolicy.java:77)
   at 
 org.apache.hadoop.hbase.regionserver.HRegion.checkSplit(HRegion.java:4921)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.splitRegion(HRegionServer.java:2901)
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5862) After Region Close remove the Operation Metrics.

2012-04-26 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262810#comment-13262810
 ] 

Lars Hofhansl commented on HBASE-5862:
--

I don't get the Hadoop private field accessor stuff. Why do we need to clear 
out private fields? Is there an API missing for this in Hadoop?

 After Region Close remove the Operation Metrics.
 

 Key: HBASE-5862
 URL: https://issues.apache.org/jira/browse/HBASE-5862
 Project: HBase
  Issue Type: Improvement
Reporter: Elliott Clark
Assignee: Elliott Clark
Priority: Critical
 Fix For: 0.94.0, 0.96.0

 Attachments: HBASE-5862-0.patch, HBASE-5862-1.patch, 
 HBASE-5862-2.patch, HBASE-5862-3.patch, HBASE-5862-4.patch, 
 HBASE-5862-94-3.patch, TSD.png


 If a region is closed then Hadoop metrics shouldn't still be reporting about 
 that region.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5862) After Region Close remove the Operation Metrics.

2012-04-26 Thread Zhihong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262808#comment-13262808
 ] 

Zhihong Yu commented on HBASE-5862:
---

{code}
+//per hfile.  Figuring out which cfs, hfiles, ...
{code}
Should cfs be in expanded form (column families) ?
{code}
+//and on the next tick of the metrics everything that is still relevant 
will be
+//re-added.
{code}
're-added' - 'added' or 'added again'

The initialization work in clear() should be moved to the 
RegionServerDynamicMetrics constructor because it is a one-time operation.

 After Region Close remove the Operation Metrics.
 

 Key: HBASE-5862
 URL: https://issues.apache.org/jira/browse/HBASE-5862
 Project: HBase
  Issue Type: Improvement
Reporter: Elliott Clark
Assignee: Elliott Clark
Priority: Critical
 Fix For: 0.94.0, 0.96.0

 Attachments: HBASE-5862-0.patch, HBASE-5862-1.patch, 
 HBASE-5862-2.patch, HBASE-5862-3.patch, HBASE-5862-4.patch, 
 HBASE-5862-94-3.patch, TSD.png


 If a region is closed then Hadoop metrics shouldn't still be reporting about 
 that region.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5862) After Region Close remove the Operation Metrics.

2012-04-26 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262813#comment-13262813
 ] 

Elliott Clark commented on HBASE-5862:
--

@lars Yes, there's a missing API.  Hadoop metrics keeps a copy of every metric 
created; that copy is used to expose the data to JMX and other consumers.
There is no remove function. HADOOP-8313 was filed to correct this.  However, 
until that changes, reflection is the only way.
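
For anyone curious what the workaround looks like in general terms, here is a 
minimal sketch of clearing a private Map-valued field via reflection; the class 
and field names below are placeholders, not the actual Hadoop metrics internals 
touched by the patch:

{code}
import java.lang.reflect.Field;
import java.util.Map;

// Illustration only: empty a private Map field on an object that offers no
// remove/clear API. getDeclaredField looks on the concrete class itself.
public final class PrivateMapCleaner {
  private PrivateMapCleaner() {}

  public static void clearPrivateMap(Object target, String fieldName) throws Exception {
    Field f = target.getClass().getDeclaredField(fieldName);
    f.setAccessible(true);                      // bypass the private modifier
    Map<?, ?> map = (Map<?, ?>) f.get(target);  // read the live map instance
    map.clear();                                // drop every registered entry
  }
}
{code}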

 After Region Close remove the Operation Metrics.
 

 Key: HBASE-5862
 URL: https://issues.apache.org/jira/browse/HBASE-5862
 Project: HBase
  Issue Type: Improvement
Reporter: Elliott Clark
Assignee: Elliott Clark
Priority: Critical
 Fix For: 0.94.0, 0.96.0

 Attachments: HBASE-5862-0.patch, HBASE-5862-1.patch, 
 HBASE-5862-2.patch, HBASE-5862-3.patch, HBASE-5862-4.patch, 
 HBASE-5862-94-3.patch, TSD.png


 If a region is closed then Hadoop metrics shouldn't still be reporting about 
 that region.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5801) [hbck] Hbck should handle case where some regions have different HTD settings in .regioninfo files (0.90 specific)

2012-04-26 Thread Jonathan Hsieh (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262816#comment-13262816
 ] 

Jonathan Hsieh commented on HBASE-5801:
---

Thanks for following through Jimmy.  Adding more flakey tests is just going to 
cause more trouble down the line and it is better if we figure out and catch 
them before they get in!

 [hbck] Hbck should handle case where some regions have different HTD settings 
 in .regioninfo files  (0.90 specific)
 ---

 Key: HBASE-5801
 URL: https://issues.apache.org/jira/browse/HBASE-5801
 Project: HBase
  Issue Type: Improvement
  Components: hbck
Affects Versions: 0.90.7
Reporter: Jonathan Hsieh
Assignee: Jimmy Xiang
 Fix For: 0.90.7

 Attachments: hbase_5801_v2.patch


 Recently, we encountered a case where some regions in a table have different 
 HTableDescriptor settings serialized into their HRegionInfo .regioninfo files 
 in HDFS.  hbck expects all HTDs within a table to be the same and currently 
 bails out in this situation.
 We need to either point out a proper set of actions for the user to execute 
 or automatically convert the region to a common HTD (likely the most common 
 one, or possibly the first one).
 Not sure if this requires reformatting data, but it may require closing and 
 restarting a region.
 This issue is hbase 0.90.x specific -- 0.92+ keeps all table info in a single 
 .tableinfo file.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5801) [hbck] Hbck should handle case where some regions have different HTD settings in .regioninfo files (0.90 specific)

2012-04-26 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-5801:
---

Status: Open  (was: Patch Available)

 [hbck] Hbck should handle case where some regions have different HTD settings 
 in .regioninfo files  (0.90 specific)
 ---

 Key: HBASE-5801
 URL: https://issues.apache.org/jira/browse/HBASE-5801
 Project: HBase
  Issue Type: Improvement
  Components: hbck
Affects Versions: 0.90.7
Reporter: Jonathan Hsieh
Assignee: Jimmy Xiang
 Fix For: 0.90.7

 Attachments: hbase_5801_v2.patch, hbase_5801_v3.patch


 Recently, we encountered a case where some regions in a table have different 
 HTableDescriptor settings serialized into their HRegionInfo .regioninfo files 
 in HDFS.  hbck expects all HTDs within a table to be the same and currently 
 bails out in this situation.
 We need to either point out a proper set of actions for the user to execute 
 or automatically convert the region to a common HTD (likely the most common 
 one, or possibly the first one).
 Not sure if this requires reformatting data, but it may require closing and 
 restarting a region.
 This issue is hbase 0.90.x specific -- 0.92+ keeps all table info in a single 
 .tableinfo file.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5801) [hbck] Hbck should handle case where some regions have different HTD settings in .regioninfo files (0.90 specific)

2012-04-26 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-5801:
---

Status: Patch Available  (was: Open)

I ran TestHBaseFsck 10x with v3 patch and all passed.

 [hbck] Hbck should handle case where some regions have different HTD settings 
 in .regioninfo files  (0.90 specific)
 ---

 Key: HBASE-5801
 URL: https://issues.apache.org/jira/browse/HBASE-5801
 Project: HBase
  Issue Type: Improvement
  Components: hbck
Affects Versions: 0.90.7
Reporter: Jonathan Hsieh
Assignee: Jimmy Xiang
 Fix For: 0.90.7

 Attachments: hbase_5801_v2.patch, hbase_5801_v3.patch


 Recently, we encountered a case where some regions in a table have different 
 HTableDescriptor settings serialized into their HRegionInfo .regioninfo files 
 in HDFS.  hbck expects all HTDs within a table to be the same and currently 
 bails out in this situation.
 We need to either point out a proper set of actions for the user to execute 
 or automatically convert the region to a common HTD (likely the most common 
 one, or possibly the first one).
 Not sure if this requires reformatting data, but it may require closing and 
 restarting a region.
 This issue is hbase 0.90.x specific -- 0.92+ keeps all table info in a single 
 .tableinfo file.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5877) When a query fails because the region has moved, let the regionserver return the new address to the client

2012-04-26 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262820#comment-13262820
 ] 

stack commented on HBASE-5877:
--

@N This patch will benefit any move, not just rolling restart, right?

Here are a quick couple of comments on the patch.

I like addition of RegionMovedException.

Convention is to capitalize static defines: '+  private static final String 
hostField = "hostname=";' so it should be HOSTFIELD.

It's super ugly that you have to parse the exception message to get exception 
data member fields... but that's not your fault.
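
To make the two points above concrete, a hypothetical sketch (not the attached 
patch) with the constant capitalized and a helper that parses the value back 
out of the exception message:

{code}
// Hypothetical names; assumes the message embeds something like
// "... hostname=host1.example.com port=60020 ...".
public final class MovedRegionMessage {
  private static final String HOST_FIELD = "hostname=";

  static String format(String hostname) {
    return HOST_FIELD + hostname;
  }

  static String parseHostname(String message) {
    int start = message.indexOf(HOST_FIELD);
    if (start < 0) {
      return null;                          // marker not present
    }
    start += HOST_FIELD.length();
    int end = message.indexOf(' ', start);  // value runs up to the next space
    return end < 0 ? message.substring(start) : message.substring(start, end);
  }
}
{code}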

Please keep the style of the surrounding code.  This kinda thing is 
unconventional:

{code}
+private void updateCachedLocations(
+  SetString updateHistory,
+  HRegionLocation hrl,
+  Object t) {
{code}

Ugh on how ugly it is updating cache.  Again, not your fault.

Ted suggests updating interface version.  Maybe don't.  If you do, you can't 
get this into a 0.94.1, etc.

I don't see changes to make use of this new functionality?  I'd expect the 
balancer in master to make use of it?

 When a query fails because the region has moved, let the regionserver return 
 the new address to the client
 --

 Key: HBASE-5877
 URL: https://issues.apache.org/jira/browse/HBASE-5877
 Project: HBase
  Issue Type: Improvement
  Components: client, master, regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Fix For: 0.96.0

 Attachments: 5877.v1.patch


 This is mainly useful when we do a rolling restart. This will decrease the 
 load on the master and the network load.
 Note that a region is not immediately opened after a close. So:
 - it seems preferable to wait before retrying on the other server. An 
 optimisation would be to have a heuristic depending on when the region was 
 closed.
 - during a rolling restart, the server moves the regions then stops. So we 
 may have failures when the server is stopped, and this patch won't help.
 The implementation in the first patch does:
 - on the region move, there is an added parameter on the regionserver#close 
 to say where we are sending the region
 - the regionserver keeps a list of what was moved. Each entry is kept 100 
 seconds.
 - the regionserver sends a specific exception when it receives a query on a 
 moved region. This exception contains the new address.
 the client analyses the exceptions and updates its cache accordingly...
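
 For illustration, a rough, self-contained sketch of the server-side 
 bookkeeping described above -- remember the destination on close, answer 
 lookups for roughly 100 seconds, then forget -- with simplified String types; 
 it is not the attached patch.

 {code}
 import java.util.Map;
 import java.util.concurrent.ConcurrentHashMap;

 // Rough sketch of a "recently moved regions" cache: remember where a region
 // went when it was closed, answer lookups for ~100 seconds, then forget.
 // Simplified String keys/values; not the attached patch.
 public class MovedRegionsCache {
   private static final long RETENTION_MS = 100 * 1000L;

   private static final class Entry {
     final String destination;
     final long timestamp;
     Entry(String destination, long timestamp) {
       this.destination = destination;
       this.timestamp = timestamp;
     }
   }

   private final Map<String, Entry> moved = new ConcurrentHashMap<String, Entry>();

   /** Record that the region was closed because it is moving to destination. */
   public void regionMoved(String encodedRegionName, String destination) {
     moved.put(encodedRegionName, new Entry(destination, System.currentTimeMillis()));
   }

   /** Return the new location if the region moved recently, otherwise null. */
   public String getNewLocation(String encodedRegionName) {
     Entry e = moved.get(encodedRegionName);
     if (e == null) {
       return null;
     }
     if (System.currentTimeMillis() - e.timestamp > RETENTION_MS) {
       moved.remove(encodedRegionName);  // lazily drop expired entries
       return null;
     }
     return e.destination;
   }
 }
 {code}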

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HBASE-5885) Invalid HFile block magic on Local file System

2012-04-26 Thread Elliott Clark (JIRA)
Elliott Clark created HBASE-5885:


 Summary: Invalid HFile block magic on Local file System
 Key: HBASE-5885
 URL: https://issues.apache.org/jira/browse/HBASE-5885
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.0, 0.96.0
Reporter: Elliott Clark


ERROR: java.lang.RuntimeException: 
org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after 
attempts=7, exceptions:
Thu Apr 26 11:19:18 PDT 2012, 
org.apache.hadoop.hbase.client.ScannerCallable@190a621a, java.io.IOException: 
java.io.IOException: Could not iterate StoreFileScanner[HFileScanner for reader 
reader=file:/tmp/hbase-eclark/hbase/TestTable/e2d1c846363c75262cbfd85ea278b342/info/bae2681d63734066957b58fe791a0268,
 compression=none, cacheConf=CacheConfig:enabled [cacheDataOnRead=true] 
[cacheDataOnWrite=false] [cacheIndexesOnWrite=false] [cacheBloomsOnWrite=false] 
[cacheEvictOnClose=false] [cacheCompressed=false], 
firstKey=01/info:data/1335463981520/Put, 
lastKey=0002588100/info:data/1335463902296/Put, avgKeyLen=30, avgValueLen=1000, 
entries=1215085, length=1264354417, 
cur=000248/info:data/1335463994457/Put/vlen=1000/ts=0]
at 
org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:135)
at 
org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:95)
at 
org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:368)
at 
org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:127)
at 
org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:3323)
at 
org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3279)
at 
org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3296)
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:2393)
at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
at 
org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1376)
Caused by: java.io.IOException: Invalid HFile block magic: 
\xEC\xD5\x9D\xB4\xC2bfo
at org.apache.hadoop.hbase.io.hfile.BlockType.parse(BlockType.java:153)
at org.apache.hadoop.hbase.io.hfile.BlockType.read(BlockType.java:164)
at 
org.apache.hadoop.hbase.io.hfile.HFileBlock.<init>(HFileBlock.java:254)
at 
org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockDataInternal(HFileBlock.java:1779)
at 
org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockData(HFileBlock.java:1637)
at 
org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:327)
at 
org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.readNextDataBlock(HFileReaderV2.java:555)
at 
org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.next(HFileReaderV2.java:651)
at 
org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:130)
... 12 more

Thu Apr 26 11:19:19 PDT 2012, 
org.apache.hadoop.hbase.client.ScannerCallable@190a621a, java.io.IOException: 
java.io.IOException: java.lang.IllegalArgumentException
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:1132)
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:1121)
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:2420)
at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
at 
org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1376)
Caused by: java.lang.IllegalArgumentException
at java.nio.Buffer.position(Buffer.java:216)
at 
org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.next(HFileReaderV2.java:630)
at 
org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:130)
at 
org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:95)
at 
org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:406)
at 
org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:127)
at 
org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:3323)
at 

[jira] [Commented] (HBASE-5885) Invalid HFile block magic on Local file System

2012-04-26 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262828#comment-13262828
 ] 

Elliott Clark commented on HBASE-5885:
--

Forgot to add that I also tried this with and without HBASE-5864

 Invalid HFile block magic on Local file System
 --

 Key: HBASE-5885
 URL: https://issues.apache.org/jira/browse/HBASE-5885
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.0, 0.96.0
Reporter: Elliott Clark

 ERROR: java.lang.RuntimeException: 
 org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after 
 attempts=7, exceptions:
 Thu Apr 26 11:19:18 PDT 2012, 
 org.apache.hadoop.hbase.client.ScannerCallable@190a621a, java.io.IOException: 
 java.io.IOException: Could not iterate StoreFileScanner[HFileScanner for 
 reader 
 reader=file:/tmp/hbase-eclark/hbase/TestTable/e2d1c846363c75262cbfd85ea278b342/info/bae2681d63734066957b58fe791a0268,
  compression=none, cacheConf=CacheConfig:enabled [cacheDataOnRead=true] 
 [cacheDataOnWrite=false] [cacheIndexesOnWrite=false] 
 [cacheBloomsOnWrite=false] [cacheEvictOnClose=false] [cacheCompressed=false], 
 firstKey=01/info:data/1335463981520/Put, 
 lastKey=0002588100/info:data/1335463902296/Put, avgKeyLen=30, 
 avgValueLen=1000, entries=1215085, length=1264354417, 
 cur=000248/info:data/1335463994457/Put/vlen=1000/ts=0]
   at 
 org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:135)
   at 
 org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:95)
   at 
 org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:368)
   at 
 org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:127)
   at 
 org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:3323)
   at 
 org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3279)
   at 
 org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3296)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:2393)
   at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
   at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1376)
 Caused by: java.io.IOException: Invalid HFile block magic: 
 \xEC\xD5\x9D\xB4\xC2bfo
   at org.apache.hadoop.hbase.io.hfile.BlockType.parse(BlockType.java:153)
   at org.apache.hadoop.hbase.io.hfile.BlockType.read(BlockType.java:164)
   at 
 org.apache.hadoop.hbase.io.hfile.HFileBlock.<init>(HFileBlock.java:254)
   at 
 org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockDataInternal(HFileBlock.java:1779)
   at 
 org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockData(HFileBlock.java:1637)
   at 
 org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:327)
   at 
 org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.readNextDataBlock(HFileReaderV2.java:555)
   at 
 org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.next(HFileReaderV2.java:651)
   at 
 org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:130)
   ... 12 more
 Thu Apr 26 11:19:19 PDT 2012, 
 org.apache.hadoop.hbase.client.ScannerCallable@190a621a, java.io.IOException: 
 java.io.IOException: java.lang.IllegalArgumentException
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:1132)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:1121)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:2420)
   at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
   at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1376)
 Caused by: java.lang.IllegalArgumentException
   at java.nio.Buffer.position(Buffer.java:216)
   at 
 org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.next(HFileReaderV2.java:630)
   at 
 org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:130)
   at 
 org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:95)
   at 
 

[jira] [Commented] (HBASE-5885) Invalid HFile block magic on Local file System

2012-04-26 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262837#comment-13262837
 ] 

stack commented on HBASE-5885:
--

What if you disable this setting?   Does it still fail?

{code}
+  /** 
+   * If this parameter is set to true, then hbase will read
+   * data and then verify checksums. Checksum verification 
+   * inside hdfs will be switched off.  However, if the hbase-checksum 
+   * verification fails, then it will switch back to using
+   * hdfs checksums for verifiying data that is being read from storage.
+   *
+   * If this parameter is set to false, then hbase will not
+   * verify any checksums, instead it will depend on checksum verification
+   * being done in the hdfs client.
+   */
+  public static final String HBASE_CHECKSUM_VERIFICATION = 
+  hbase.regionserver.checksum.verify;
{code}
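
For reference, one way to flip it in a test setup is to set that key to false 
on the Configuration handed to the local/mini cluster (a sketch, assuming an 
HBase classpath; the actual wiring of the Configuration is up to the test):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

// Turn off HBase-level checksum verification so reads fall back to HDFS checksums.
Configuration conf = HBaseConfiguration.create();
conf.setBoolean("hbase.regionserver.checksum.verify", false);
{code}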

 Invalid HFile block magic on Local file System
 --

 Key: HBASE-5885
 URL: https://issues.apache.org/jira/browse/HBASE-5885
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.0, 0.96.0
Reporter: Elliott Clark

 ERROR: java.lang.RuntimeException: 
 org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after 
 attempts=7, exceptions:
 Thu Apr 26 11:19:18 PDT 2012, 
 org.apache.hadoop.hbase.client.ScannerCallable@190a621a, java.io.IOException: 
 java.io.IOException: Could not iterate StoreFileScanner[HFileScanner for 
 reader 
 reader=file:/tmp/hbase-eclark/hbase/TestTable/e2d1c846363c75262cbfd85ea278b342/info/bae2681d63734066957b58fe791a0268,
  compression=none, cacheConf=CacheConfig:enabled [cacheDataOnRead=true] 
 [cacheDataOnWrite=false] [cacheIndexesOnWrite=false] 
 [cacheBloomsOnWrite=false] [cacheEvictOnClose=false] [cacheCompressed=false], 
 firstKey=01/info:data/1335463981520/Put, 
 lastKey=0002588100/info:data/1335463902296/Put, avgKeyLen=30, 
 avgValueLen=1000, entries=1215085, length=1264354417, 
 cur=000248/info:data/1335463994457/Put/vlen=1000/ts=0]
   at 
 org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:135)
   at 
 org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:95)
   at 
 org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:368)
   at 
 org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:127)
   at 
 org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:3323)
   at 
 org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3279)
   at 
 org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3296)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:2393)
   at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
   at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1376)
 Caused by: java.io.IOException: Invalid HFile block magic: 
 \xEC\xD5\x9D\xB4\xC2bfo
   at org.apache.hadoop.hbase.io.hfile.BlockType.parse(BlockType.java:153)
   at org.apache.hadoop.hbase.io.hfile.BlockType.read(BlockType.java:164)
   at 
 org.apache.hadoop.hbase.io.hfile.HFileBlock.<init>(HFileBlock.java:254)
   at 
 org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockDataInternal(HFileBlock.java:1779)
   at 
 org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockData(HFileBlock.java:1637)
   at 
 org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:327)
   at 
 org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.readNextDataBlock(HFileReaderV2.java:555)
   at 
 org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.next(HFileReaderV2.java:651)
   at 
 org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:130)
   ... 12 more
 Thu Apr 26 11:19:19 PDT 2012, 
 org.apache.hadoop.hbase.client.ScannerCallable@190a621a, java.io.IOException: 
 java.io.IOException: java.lang.IllegalArgumentException
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:1132)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:1121)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:2420)
   at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at 

[jira] [Updated] (HBASE-5879) Enable JMX metrics collection for the Thrift proxy

2012-04-26 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-5879:
---

Attachment: D2955.1.patch

mbautin requested code review of "[jira] [HBASE-5879] [89-fb] Enable JMX 
metrics collection for the Thrift proxy".
Reviewers: Kannan, Liyin, sc, tedyu, JIRA

  We need to enable JMX on the Thrift proxy on a separate port different from 
the JMX port used by regionserver. This is necessary for metrics collection.

TEST PLAN
  - Deploy to dev cluster.
  - Verify that it is possible to collect metrics through JMX from the Thrift 
proxy.

REVISION DETAIL
  https://reviews.facebook.net/D2955

AFFECTED FILES
  bin/hbase-config.sh

MANAGE HERALD DIFFERENTIAL RULES
  https://reviews.facebook.net/herald/view/differential/

WHY DID I GET THIS EMAIL?
  https://reviews.facebook.net/herald/transcript/6735/

Tip: use the X-Herald-Rules header to filter Herald messages in your client.


 Enable JMX metrics collection for the Thrift proxy
 --

 Key: HBASE-5879
 URL: https://issues.apache.org/jira/browse/HBASE-5879
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin
Priority: Minor
 Attachments: D2955.1.patch


 We need to enable JMX on the Thrift proxy on a separate port different from 
 the JMX port used by regionserver. This is necessary for metrics collection.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5879) Enable JMX metrics collection for the Thrift proxy

2012-04-26 Thread Phabricator (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262841#comment-13262841
 ] 

Phabricator commented on HBASE-5879:


Kannan has accepted the revision "[jira] [HBASE-5879] [89-fb] Enable JMX 
metrics collection for the Thrift proxy".

REVISION DETAIL
  https://reviews.facebook.net/D2955

BRANCH
  enable_jmx_metrics_collection_for_the_thrift_HBASE-5879


 Enable JMX metrics collection for the Thrift proxy
 --

 Key: HBASE-5879
 URL: https://issues.apache.org/jira/browse/HBASE-5879
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin
Priority: Minor
 Attachments: D2955.1.patch


 We need to enable JMX on the Thrift proxy on a separate port different from 
 the JMX port used by regionserver. This is necessary for metrics collection.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5885) Invalid HFile block magic on Local file System

2012-04-26 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262880#comment-13262880
 ] 

Elliott Clark commented on HBASE-5885:
--

I don't get an exception when that is set to false.

 Invalid HFile block magic on Local file System
 --

 Key: HBASE-5885
 URL: https://issues.apache.org/jira/browse/HBASE-5885
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.0, 0.96.0
Reporter: Elliott Clark

 ERROR: java.lang.RuntimeException: 
 org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after 
 attempts=7, exceptions:
 Thu Apr 26 11:19:18 PDT 2012, 
 org.apache.hadoop.hbase.client.ScannerCallable@190a621a, java.io.IOException: 
 java.io.IOException: Could not iterate StoreFileScanner[HFileScanner for 
 reader 
 reader=file:/tmp/hbase-eclark/hbase/TestTable/e2d1c846363c75262cbfd85ea278b342/info/bae2681d63734066957b58fe791a0268,
  compression=none, cacheConf=CacheConfig:enabled [cacheDataOnRead=true] 
 [cacheDataOnWrite=false] [cacheIndexesOnWrite=false] 
 [cacheBloomsOnWrite=false] [cacheEvictOnClose=false] [cacheCompressed=false], 
 firstKey=01/info:data/1335463981520/Put, 
 lastKey=0002588100/info:data/1335463902296/Put, avgKeyLen=30, 
 avgValueLen=1000, entries=1215085, length=1264354417, 
 cur=000248/info:data/1335463994457/Put/vlen=1000/ts=0]
   at 
 org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:135)
   at 
 org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:95)
   at 
 org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:368)
   at 
 org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:127)
   at 
 org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:3323)
   at 
 org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3279)
   at 
 org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3296)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:2393)
   at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
   at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1376)
 Caused by: java.io.IOException: Invalid HFile block magic: 
 \xEC\xD5\x9D\xB4\xC2bfo
   at org.apache.hadoop.hbase.io.hfile.BlockType.parse(BlockType.java:153)
   at org.apache.hadoop.hbase.io.hfile.BlockType.read(BlockType.java:164)
   at 
 org.apache.hadoop.hbase.io.hfile.HFileBlock.<init>(HFileBlock.java:254)
   at 
 org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockDataInternal(HFileBlock.java:1779)
   at 
 org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockData(HFileBlock.java:1637)
   at 
 org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:327)
   at 
 org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.readNextDataBlock(HFileReaderV2.java:555)
   at 
 org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.next(HFileReaderV2.java:651)
   at 
 org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:130)
   ... 12 more
 Thu Apr 26 11:19:19 PDT 2012, 
 org.apache.hadoop.hbase.client.ScannerCallable@190a621a, java.io.IOException: 
 java.io.IOException: java.lang.IllegalArgumentException
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:1132)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:1121)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:2420)
   at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
   at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1376)
 Caused by: java.lang.IllegalArgumentException
   at java.nio.Buffer.position(Buffer.java:216)
   at 
 org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.next(HFileReaderV2.java:630)
   at 
 org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:130)
   at 
 org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:95)
   at 
 

[jira] [Updated] (HBASE-5851) TestProcessBasedCluster sometimes fails; currently disabled -- needs to be fixed and reenabled

2012-04-26 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-5851:
---

Status: Open  (was: Patch Available)

Agree with Stack, still flaky, will look into it more later.

 TestProcessBasedCluster sometimes fails; currently disabled -- needs to be 
 fixed and reenabled
 --

 Key: HBASE-5851
 URL: https://issues.apache.org/jira/browse/HBASE-5851
 Project: HBase
  Issue Type: Test
Reporter: Zhihong Yu
Assignee: Jimmy Xiang
 Attachments: disable.txt, hbase-5851.patch, hbase-5851_v2.patch, 
 metahang.txt, zkfail.txt


 TestProcessBasedCluster failed in 
 https://builds.apache.org/job/HBase-TRUNK-security/178
 Looks like cluster failed to start:
 {code}
 2012-04-21 14:22:32,666 INFO  [Thread-1] 
 util.ProcessBasedLocalHBaseCluster(176): Waiting for HBase to startup. 
 Retries left: 2
 java.io.IOException: Giving up trying to location region in meta: thread is 
 interrupted.
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:1173)
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:956)
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:917)
   at org.apache.hadoop.hbase.client.HTable.finishSetup(HTable.java:252)
   at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:192)
   at 
 org.apache.hadoop.hbase.util.ProcessBasedLocalHBaseCluster.startHBase(ProcessBasedLocalHBaseCluster.java:174)
   at 
 org.apache.hadoop.hbase.util.TestProcessBasedCluster.testProcessBasedCluster(TestProcessBasedCluster.java:56)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
   at 
 org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
   at 
 org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
   at 
 org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
   at 
 org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:62)
 java.lang.InterruptedException: sleep interrupted
   at java.lang.Thread.sleep(Native Method)
   at org.apache.hadoop.hbase.util.Threads.sleep(Threads.java:134)
   at 
 org.apache.hadoop.hbase.util.ProcessBasedLocalHBaseCluster.startHBase(ProcessBasedLocalHBaseCluster.java:178)
   at 
 org.apache.hadoop.hbase.util.TestProcessBasedCluster.testProcessBasedCluster(TestProcessBasedCluster.java:56)
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HBASE-5886) Add new metric for possible data loss due to puts without WAL

2012-04-26 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-5886:
--

 Summary: Add new metric for possible data loss due to puts without 
WAL 
 Key: HBASE-5886
 URL: https://issues.apache.org/jira/browse/HBASE-5886
 Project: HBase
  Issue Type: New Feature
  Components: metrics, regionserver
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Minor


 Add metrics to keep track of puts without WAL and the possible data-loss size.
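
A minimal sketch of what such counters might look like, with made-up names (the 
attached patch may differ):

{code}
import java.util.concurrent.atomic.AtomicLong;

// Two illustrative counters: how many puts skipped the WAL, and how many bytes
// are only in memory (and so would be lost on a crash before a flush).
public class NoWalMetrics {
  private final AtomicLong putsWithoutWal = new AtomicLong();
  private final AtomicLong bytesInMemoryWithoutWal = new AtomicLong();

  /** Call from the put path when writeToWAL is false. */
  public void recordPutWithoutWal(long editSizeInBytes) {
    putsWithoutWal.incrementAndGet();
    bytesInMemoryWithoutWal.addAndGet(editSizeInBytes);
  }

  /** Call after a flush makes the in-memory edits durable. */
  public void onFlush() {
    bytesInMemoryWithoutWal.set(0);
  }

  public long getPutsWithoutWal() { return putsWithoutWal.get(); }
  public long getBytesInMemoryWithoutWal() { return bytesInMemoryWithoutWal.get(); }
}
{code}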

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5886) Add new metric for possible data loss due to puts without WAL

2012-04-26 Thread Matteo Bertozzi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Bertozzi updated HBASE-5886:
---

Attachment: HBASE-5886-v0.patch

 Add new metric for possible data loss due to puts without WAL 
 --

 Key: HBASE-5886
 URL: https://issues.apache.org/jira/browse/HBASE-5886
 Project: HBase
  Issue Type: New Feature
  Components: metrics, regionserver
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Minor
  Labels: metrics
 Attachments: HBASE-5886-v0.patch


 Add metrics to keep track of puts without WAL and the possible data-loss size.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5886) Add new metric for possible data loss due to puts without WAL

2012-04-26 Thread Matteo Bertozzi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Bertozzi updated HBASE-5886:
---

Status: Patch Available  (was: Open)

 Add new metric for possible data loss due to puts without WAL 
 --

 Key: HBASE-5886
 URL: https://issues.apache.org/jira/browse/HBASE-5886
 Project: HBase
  Issue Type: New Feature
  Components: metrics, regionserver
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Minor
  Labels: metrics
 Attachments: HBASE-5886-v0.patch


 Add metrics to keep track of puts without WAL and the possible data-loss size.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5886) Add new metric for possible data loss due to puts without WAL

2012-04-26 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13262994#comment-13262994
 ] 

Todd Lipcon commented on HBASE-5886:


I think the metrics names could be improved here -- putWithoutWAL could be 
numPutsWithoutWAL, and possibleDataLossSize could perhaps be mbInMemoryWithoutWAL?

- in recordPossibleDataLossNoWal, use {{getAndIncrement}} instead of 
{{incrementAndGet}}, and compare the result to 0 -- otherwise you're doing an 
extra atomic op and have a potential race (see the sketch after this list)
- the warning message should include the client IP address as well as the 
region ID. Perhaps something like "Client 123.123.123.123 writing data to 
region abcdef12345 with WAL disabled. Data may be lost in the event of a crash. 
Will not log further warnings for this region."

- the debug message when setting possibleDataLossSize back to 0 seems 
unnecessary.

- DEFAULT_WARN_NO_WAL_INTERVAL is unused
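
The getAndIncrement point above boils down to this small, race-free "warn once" 
pattern (illustrative names, not the patch itself):

{code}
import java.util.concurrent.atomic.AtomicLong;

// getAndIncrement returns the value *before* the bump, so exactly one caller
// ever sees 0 and emits the warning; every call still counts the event.
public class WarnOnceCounter {
  private final AtomicLong count = new AtomicLong();

  /** Returns true only for the very first call; always increments the count. */
  public boolean recordAndIsFirst() {
    return count.getAndIncrement() == 0;
  }
}
{code}

The caller would then do the LOG.warn only when recordAndIsFirst() returns true.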


 Add new metric for possible data loss due to puts without WAL 
 --

 Key: HBASE-5886
 URL: https://issues.apache.org/jira/browse/HBASE-5886
 Project: HBase
  Issue Type: New Feature
  Components: metrics, regionserver
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Minor
  Labels: metrics
 Attachments: HBASE-5886-v0.patch


 Add metrics to keep track of puts without WAL and the possible data-loss size.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5864) Error while reading from hfile in 0.94

2012-04-26 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-5864:
-

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Committed to 0.94 and 0.96.
Thanks for the patch, Ram.
Thanks for the reviews, Dhruba and Ted.

 Error while reading from hfile in 0.94
 --

 Key: HBASE-5864
 URL: https://issues.apache.org/jira/browse/HBASE-5864
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.94.0
Reporter: Gopinathan A
Assignee: ramkrishna.s.vasudevan
Priority: Blocker
 Fix For: 0.94.0

 Attachments: HBASE-5864_1.patch, HBASE-5864_2.patch, 
 HBASE-5864_3.patch, HBASE-5864_test.patch


 Got the following stacktrace during region split.
 {noformat}
 2012-04-24 16:05:42,168 WARN org.apache.hadoop.hbase.regionserver.Store: 
 Failed getting store size for value
 java.io.IOException: Requested block is out of range: 2906737606134037404, 
 lastDataBlockOffset: 84764558
   at 
 org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:278)
   at 
 org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.midkey(HFileBlockIndex.java:285)
   at 
 org.apache.hadoop.hbase.io.hfile.HFileReaderV2.midkey(HFileReaderV2.java:402)
   at 
 org.apache.hadoop.hbase.regionserver.StoreFile$Reader.midkey(StoreFile.java:1638)
   at 
 org.apache.hadoop.hbase.regionserver.Store.getSplitPoint(Store.java:1943)
   at 
 org.apache.hadoop.hbase.regionserver.RegionSplitPolicy.getSplitPoint(RegionSplitPolicy.java:77)
   at 
 org.apache.hadoop.hbase.regionserver.HRegion.checkSplit(HRegion.java:4921)
   at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.splitRegion(HRegionServer.java:2901)
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5737) Minor Improvements related to balancer.

2012-04-26 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-5737:
-

Issue Type: Improvement  (was: Bug)

 Minor Improvements related to balancer.
 ---

 Key: HBASE-5737
 URL: https://issues.apache.org/jira/browse/HBASE-5737
 Project: HBase
  Issue Type: Improvement
  Components: master
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
Priority: Minor
 Fix For: 0.94.0, 0.96.0

 Attachments: HBASE-5737.patch, HBASE-5737_1.patch, 
 HBASE-5737_2.patch, HBASE-5737_3.patch


 Currently in Am.getAssignmentByTable() we use a result map which is currently 
 a HashMap.  It could be better if we used a TreeMap.  Even in 
 MetaReader.fullScan we use a TreeMap, so that the naming order is maintained. 
 I felt this change could be very useful in cases where we are extending the 
 DefaultLoadBalancer.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5866) Canary in tool package but says its in tools.

2012-04-26 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-5866:
-

Issue Type: New Feature  (was: Bug)

 Canary in tool package but says its in tools.
 -

 Key: HBASE-5866
 URL: https://issues.apache.org/jira/browse/HBASE-5866
 Project: HBase
  Issue Type: New Feature
Reporter: stack
Assignee: stack
 Fix For: 0.94.0, 0.96.0

 Attachments: 5866.txt




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5866) Canary in tool package but says its in tools.

2012-04-26 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-5866:
-

Issue Type: Bug  (was: New Feature)

 Canary in tool package but says its in tools.
 -

 Key: HBASE-5866
 URL: https://issues.apache.org/jira/browse/HBASE-5866
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: stack
 Fix For: 0.94.0, 0.96.0

 Attachments: 5866.txt




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5862) After Region Close remove the Operation Metrics.

2012-04-26 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13263094#comment-13263094
 ] 

Lars Hofhansl commented on HBASE-5862:
--

Thanks, Elliott.

 After Region Close remove the Operation Metrics.
 

 Key: HBASE-5862
 URL: https://issues.apache.org/jira/browse/HBASE-5862
 Project: HBase
  Issue Type: Improvement
Reporter: Elliott Clark
Assignee: Elliott Clark
Priority: Critical
 Fix For: 0.94.0, 0.96.0

 Attachments: HBASE-5862-0.patch, HBASE-5862-1.patch, 
 HBASE-5862-2.patch, HBASE-5862-3.patch, HBASE-5862-4.patch, 
 HBASE-5862-94-3.patch, TSD.png


 If a region is closed then Hadoop metrics shouldn't still be reporting about 
 that region.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5864) Error while reading from hfile in 0.94

2012-04-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13263100#comment-13263100
 ] 

Hudson commented on HBASE-5864:
---

Integrated in HBase-0.94 #151 (See 
[https://builds.apache.org/job/HBase-0.94/151/])
HBASE-5864 Error while reading from hfile in 0.94 (Ram) (Revision 1331057)

 Result = ABORTED
larsh : 
Files : 
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileWriterV2.java


 Error while reading from hfile in 0.94
 --

 Key: HBASE-5864
 URL: https://issues.apache.org/jira/browse/HBASE-5864
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.94.0
Reporter: Gopinathan A
Assignee: ramkrishna.s.vasudevan
Priority: Blocker
 Fix For: 0.94.0

 Attachments: HBASE-5864_1.patch, HBASE-5864_2.patch, 
 HBASE-5864_3.patch, HBASE-5864_test.patch


 Got the following stacktrace during region split.
 {noformat}
 2012-04-24 16:05:42,168 WARN org.apache.hadoop.hbase.regionserver.Store: 
 Failed getting store size for value
 java.io.IOException: Requested block is out of range: 2906737606134037404, lastDataBlockOffset: 84764558
   at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:278)
   at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.midkey(HFileBlockIndex.java:285)
   at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.midkey(HFileReaderV2.java:402)
   at org.apache.hadoop.hbase.regionserver.StoreFile$Reader.midkey(StoreFile.java:1638)
   at org.apache.hadoop.hbase.regionserver.Store.getSplitPoint(Store.java:1943)
   at org.apache.hadoop.hbase.regionserver.RegionSplitPolicy.getSplitPoint(RegionSplitPolicy.java:77)
   at org.apache.hadoop.hbase.regionserver.HRegion.checkSplit(HRegion.java:4921)
   at org.apache.hadoop.hbase.regionserver.HRegionServer.splitRegion(HRegionServer.java:2901)
 {noformat}
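
The trace shows HFileReaderV2.readBlock rejecting the block offset the midkey path handed it because the offset lies past the last data block. Below is a minimal sketch of that kind of range guard, assuming only that the reader knows the offset of its last data block; the class and method names are hypothetical, though the error message mirrors the IOException above.

{noformat}
// Sketch only: illustrative guard, not the actual HFileReaderV2 code.
import java.io.IOException;

public class BlockOffsetGuardSketch {
  private final long lastDataBlockOffset;  // offset of the last data block in the file

  public BlockOffsetGuardSketch(long lastDataBlockOffset) {
    this.lastDataBlockOffset = lastDataBlockOffset;
  }

  /** Rejects offsets that cannot name a data block in this file. */
  public void checkDataBlockOffset(long dataBlockOffset) throws IOException {
    if (dataBlockOffset < 0 || dataBlockOffset > lastDataBlockOffset) {
      throw new IOException("Requested block is out of range: " + dataBlockOffset
          + ", lastDataBlockOffset: " + lastDataBlockOffset);
    }
  }
}
{noformat}

An offset like 2906737606134037404 is far beyond any plausible file size, which suggests the mid-key entry was decoded from the wrong position in the index rather than the file itself being damaged.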





[jira] [Commented] (HBASE-5862) After Region Close remove the Operation Metrics.

2012-04-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13263099#comment-13263099
 ] 

Hudson commented on HBASE-5862:
---

Integrated in HBase-0.94 #151 (See 
[https://builds.apache.org/job/HBase-0.94/151/])
HBASE-5862 After Region Close remove the Operation Metrics; ADDENDUM -- 
missing import (Revision 1331040)

 Result = ABORTED
stack : 
Files : 
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionServerMetrics.java


 After Region Close remove the Operation Metrics.
 

 Key: HBASE-5862
 URL: https://issues.apache.org/jira/browse/HBASE-5862
 Project: HBase
  Issue Type: Improvement
Reporter: Elliott Clark
Assignee: Elliott Clark
Priority: Critical
 Fix For: 0.94.0, 0.96.0

 Attachments: HBASE-5862-0.patch, HBASE-5862-1.patch, 
 HBASE-5862-2.patch, HBASE-5862-3.patch, HBASE-5862-4.patch, 
 HBASE-5862-94-3.patch, TSD.png


 If a region is closed then Hadoop metrics shouldn't still be reporting about 
 that region.





[jira] [Commented] (HBASE-5886) Add new metric for possible data loss due to puts without WAL

2012-04-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13263111#comment-13263111
 ] 

Hadoop QA commented on HBASE-5886:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12524562/HBASE-5886-v0.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 2 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.io.TestHeapSize

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1659//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1659//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/1659//console

This message is automatically generated.

 Add new metric for possible data loss due to puts without WAL 
 --

 Key: HBASE-5886
 URL: https://issues.apache.org/jira/browse/HBASE-5886
 Project: HBase
  Issue Type: New Feature
  Components: metrics, regionserver
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Minor
  Labels: metrics
 Attachments: HBASE-5886-v0.patch


 Add a metric to keep track of puts without WAL and the possible data-loss size.
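
A minimal sketch of what such a metric could track, assuming a per-region-server counter plus a running byte total that are bumped whenever a put is applied with the WAL skipped; the names are illustrative and this is not the attached HBASE-5886-v0.patch.

{noformat}
// Sketch only: counts puts that skip the WAL and the bytes that would be lost
// if the server died before the memstore is flushed.
import java.util.concurrent.atomic.AtomicLong;

public class PutWithoutWalMetricsSketch {
  private final AtomicLong numPutsWithoutWal = new AtomicLong();
  private final AtomicLong dataInMemoryWithoutWal = new AtomicLong();  // bytes at risk

  /** Call once per put applied with writeToWAL == false. */
  public void updateForPutWithoutWal(long putHeapSize) {
    numPutsWithoutWal.incrementAndGet();
    dataInMemoryWithoutWal.addAndGet(putHeapSize);
  }

  public long getNumPutsWithoutWal() { return numPutsWithoutWal.get(); }
  public long getDataInMemoryWithoutWal() { return dataInMemoryWithoutWal.get(); }
}
{noformat}

Resetting the byte total after a successful flush would keep it meaningful as a "currently at risk" figure, while the raw counter can stay monotonic for rate graphs.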




