[jira] [Commented] (HBASE-5635) If getTaskList() returns null splitlogWorker is down. It wont serve any requests.

2012-04-20 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13258414#comment-13258414
 ] 

Lars Hofhansl commented on HBASE-5635:
--

+1 on patch. I wonder how many more problems like this we have lurking in the 
worker threads.
Can you do a 0.94.1 patch as well?

 If getTaskList() returns null splitlogWorker is down. It wont serve any 
 requests. 
 --

 Key: HBASE-5635
 URL: https://issues.apache.org/jira/browse/HBASE-5635
 Project: HBase
  Issue Type: Bug
  Components: wal
Affects Versions: 0.92.1
Reporter: Kristam Subba Swathi
 Attachments: HBASE-5635.1.patch, HBASE-5635.2.patch, HBASE-5635.patch


 During the hlog split operation, if all the ZooKeepers are down, the 
 paths are returned as null and the split worker thread exits.
 This regionserver will then be unable to acquire any other tasks, since the 
 split worker thread has exited.
 Please see the following code for more details:
 {code}
 private List<String> getTaskList() {
   for (int i = 0; i < zkretries; i++) {
     try {
       return (ZKUtil.listChildrenAndWatchForNewChildren(this.watcher,
           this.watcher.splitLogZNode));
     } catch (KeeperException e) {
       LOG.warn("Could not get children of znode " +
           this.watcher.splitLogZNode, e);
       try {
         Thread.sleep(1000);
       } catch (InterruptedException e1) {
         LOG.warn("Interrupted while trying to get task list ...", e1);
         Thread.currentThread().interrupt();
         return null;
       }
     }
   }
 {code}
 in the org.apache.hadoop.hbase.regionserver.SplitLogWorker 
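
As an illustration of the failure mode described above, here is a minimal sketch of a retry helper that never hands null back to the worker loop. It is not the patch attached to HBASE-5635. The class name `TaskListRetrySketch`, the method `fetchWithRetries`, and the use of a plain `Callable` in place of the ZooKeeper call are all assumptions made for the example.

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Callable;

class TaskListRetrySketch {
    /**
     * Calls fetcher up to maxRetries times. Returns an empty list (never null)
     * when every attempt fails, so a caller can keep looping instead of exiting.
     * The fetcher is a hypothetical stand-in for the ZooKeeper children lookup.
     */
    static List<String> fetchWithRetries(Callable<List<String>> fetcher,
                                         int maxRetries, long sleepMillis) {
        for (int i = 0; i < maxRetries; i++) {
            try {
                List<String> children = fetcher.call();
                if (children != null) {
                    return children;
                }
            } catch (Exception e) {
                // Stands in for KeeperException handling; fall through to the sleep.
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e1) {
                // Restore the interrupt flag and stop retrying.
                Thread.currentThread().interrupt();
                break;
            }
        }
        return Collections.emptyList();
    }
}
```

With this shape, a caller would treat an empty list as "nothing to do right now" and consult `Thread.currentThread().isInterrupted()` to decide whether to stop, rather than inferring shutdown from a null return.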
  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5547) Don't delete HFiles when in backup mode

2012-04-19 Thread Lars Hofhansl (Commented) (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13257818#comment-13257818
]

Lars Hofhansl commented on HBASE-5547:
--

For this I was actually thinking that a single cluster-wide setting ("This HBase
cluster is now in backup mode"), which would affect all tables, would be
sufficient.

But we could do it per table as well, as Jesse has done, and it definitely does not
need to be a custom znode; checking a value is fine.

Don't delete HFiles when in backup mode
-

Key: HBASE-5547
URL: https://issues.apache.org/jira/browse/HBASE-5547
Project: HBase
Issue Type: New Feature
Reporter: Lars Hofhansl
Assignee: Jesse Yates

This came up in a discussion I had with Stack.
It would be nice if HBase could be notified that a backup is in progress (via
a znode for example) and in that case either:
1. rename HFiles that are to be deleted to file.bck
2. rename the HFiles into a special directory
3. rename them to a general trash directory (which would not need to be tied
to backup mode).
That way it should be possible to get a consistent backup based on HFiles (HDFS
snapshots or hard links would be better options here, but we do not have
those).
#1 makes cleanup a bit harder.


[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487

2012-04-18 Thread Lars Hofhansl (Commented) (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256756#comment-13256756
]

Lars Hofhansl commented on HBASE-5782:
--

For some reason I cannot run TestHLog locally. I always get:
{code}
Tests run: 0, Failures: 0, Errors: 0, Skipped: 0
{code}

I can run all other tests that I tried (including some others marked with
@LargeTests). Verified again with HLogPerformanceEvaluation manually.
Should we put HBASE-5792 in 0.94 as well, so that we use this test?

Edits can be appended out of seqid order since HBASE-4487
-

Key: HBASE-5782
URL: https://issues.apache.org/jira/browse/HBASE-5782
Project: HBase
Issue Type: Bug
Components: wal
Affects Versions: 0.94.0
Reporter: Gopinathan A
Assignee: Lars Hofhansl
Priority: Blocker
Fix For: 0.94.0

Attachments: 5782-lars-v2.txt, 5782-sketch.txt, 5782-v3.txt,
5782.txt, 5782.unfinished-stack.txt, 5782.unittest.txt, HBASE-5782.patch,
hbase-5782.txt

Create a table with 1000 splits; after the region assignment, kill the
regionserver which contains the META table.
A few regions are missing after the log splitting and region assignment.
The HBCK report shows that multiple region holes were created.
The same scenario was verified multiple times in 0.92.1 with no issues.

[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.

2012-04-18 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256757#comment-13256757
 ] 

Lars Hofhansl commented on HBASE-5545:
--

@Ram: yes that is what I meant to say (recursive is not needed, and somebody 
might misunderstand later and use this to delete tmp directory). Not important.

+1 on commit as is.

 region can't be opened for a long time. Because the creating File failed.
 -

 Key: HBASE-5545
 URL: https://issues.apache.org/jira/browse/HBASE-5545
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.90.6
Reporter: gaojinchao
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.90.7, 0.92.2, 0.94.0

 Attachments: HBASE-5545.patch, HBASE-5545.patch


 Scenario:
 
 1. The file is created.
 2. While data is being written, all datanodes might crash, so the write 
 fails.
 3. Even if close is called in the finally block, close also fails and 
 throws an exception because the write failed.
 4. If the RS then tries to create the same file again, 
 AlreadyBeingCreatedException is thrown.
 Suggestion to handle this scenario.
 ---
 1. Check whether the file exists; if it does, delete it and create a 
 new file. 
 The delete call does not check whether the file is open or 
 closed.
 Overwrite Option:
 
 1. The overwrite option applies only when you are trying to overwrite a 
 closed file.
 2. If the file is not closed, the same AlreadyBeingCreatedException is 
 thrown even with the overwrite option.
 This is the expected behaviour; it prevents multiple clients from writing to 
 the same file.
 Region server logs:
 org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to create file /hbase/test1/12c01902324218d14b17a5880f24f64b/.tmp/.regioninfo for DFSClient_hb_rs_158-1-131-48,20020,1331107668635_1331107669061_-252463556_25 on client 158.1.132.19 because current leaseholder is trying to recreate file.
     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:1570)
     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1440)
     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:1382)
     at org.apache.hadoop.hdfs.server.namenode.NameNode.create(NameNode.java:658)
     at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
     at java.lang.reflect.Method.invoke(Method.java:597)
     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:547)
     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1137)
     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1133)
     at java.security.AccessController.doPrivileged(Native Method)
     at javax.security.auth.Subject.doAs(Subject.java:396)
     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1131)
     at org.apache.hadoop.ipc.Client.call(Client.java:961)
     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:245)
     at $Proxy6.create(Unknown Source)
     at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
     at java.lang.reflect.Method.invoke(Method.java:597)
     at $Proxy6.create(Unknown Source)
     at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.<init>(DFSClient.java:3643)
     at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:778)
     at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:364)
     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:630)
     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:611)
     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:518)
     at org.apache.hadoop.hbase.regionserver.HRegion.checkRegioninfoOnFilesystem(HRegion.java:424)
     at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:340)
     at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2672)
     at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2658)
     at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:330)
     at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:116)
     at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:158)
     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
     at java.lang.Thread.run(Thread.java:662)
 [2012-03-07 20:51:45,858] [WARN ]
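
The suggestion in the description above (check for the file, delete it, then create it again) can be sketched roughly as follows. This is only an illustration on the local filesystem using `java.nio.file`; it has none of the HDFS lease semantics discussed in the issue and is not the committed patch. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class RecreateFileSketch {
    /**
     * Removes a leftover file from an earlier failed attempt, then creates it anew.
     * The delete targets a single file only (no recursive delete is needed).
     */
    static void recreate(Path file, byte[] content) throws IOException {
        // No-op when the file is absent; otherwise removes the stale copy.
        Files.deleteIfExists(file);
        // CREATE_NEW fails if the file still exists, mirroring a strict create.
        Files.write(file, content,
                StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE);
    }
}
```

Deleting a single file rather than a directory tree matches the remark in the comment above that a recursive delete is not needed here.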

[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487

2012-04-18 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256933#comment-13256933
 ] 

Lars Hofhansl commented on HBASE-5782:
--

Now note that TestHLog still does not run anything locally (neither in 0.94 nor in 
trunk); there seems to be something about that specific test.


[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.

2012-04-18 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256938#comment-13256938
 ] 

Lars Hofhansl commented on HBASE-5545:
--

I'm going to commit this to 0.94 and 0.96 in the next few minutes.


[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487

2012-04-18 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13257105#comment-13257105
 ] 

Lars Hofhansl commented on HBASE-5782:
--

Yes.


[jira] [Commented] (HBASE-5547) Don't delete HFiles when in backup mode

2012-04-18 Thread Lars Hofhansl (Commented) (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13257114#comment-13257114
]

Lars Hofhansl commented on HBASE-5547:
--

What I had in mind when I filed this was something quite simple: A single znode
that all RegionServers would check when deleting any HFile and then based on
existence or value of that znode either delete the HFile or rename it. Of
course, things are never quite that simple...

To get a guaranteed consistent snapshot the RegionServers need to check for the
znode's value synchronously in the delete path (or at least I see no other
way). Otherwise there are times when the RegionServers do not agree and some
files will be deleted and some will be backed up with no possibility for the
client to know exactly as of when the backup would be consistent.

Since HFiles are deleted as result of a compaction in an asynchronous thread,
synchronously checking the znode should not cause performance issues, unless we
fear we'll overload ZK.

This simple solution would add special code for just this scenario, which is
bad. At the same time it would be relatively simple (famous last words), so
that's something to weigh.
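
To make the "check synchronously in the delete path" idea concrete, here is a minimal sketch under stated assumptions. The `BooleanSupplier` is a hypothetical stand-in for reading the znode's value, the local filesystem stands in for HDFS, and the names `BackupAwareDeleteSketch`, `deleteOrArchive`, and `archiveDir` are invented for the example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.function.BooleanSupplier;

class BackupAwareDeleteSketch {
    private final BooleanSupplier backupMode; // hypothetical: reads the backup flag
    private final Path archiveDir;            // hypothetical: where files are kept

    BackupAwareDeleteSketch(BooleanSupplier backupMode, Path archiveDir) {
        this.backupMode = backupMode;
        this.archiveDir = archiveDir;
    }

    /**
     * Consults the flag on every call (no caching), then either moves the file
     * into the archive directory or deletes it.
     * Returns true if the file was archived, false if it was deleted.
     */
    boolean deleteOrArchive(Path hfile) throws IOException {
        if (backupMode.getAsBoolean()) {
            Files.createDirectories(archiveDir);
            Files.move(hfile, archiveDir.resolve(hfile.getFileName()),
                    StandardCopyOption.REPLACE_EXISTING);
            return true;
        }
        Files.delete(hfile);
        return false;
    }
}
```

Reading the flag on each call rather than caching it is what keeps every deleter in agreement about when backup mode began, at the cost of one lookup per delete.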


[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.

2012-04-17 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13255714#comment-13255714
 ] 

Lars Hofhansl commented on HBASE-5545:
--

If this gets done before HBASE-5782 I'll pull it in, otherwise I don't think I 
should hold up the release for this. Sounds good?


 region can't be opened for a long time. Because the creating File failed.
 -

 Key: HBASE-5545
 URL: https://issues.apache.org/jira/browse/HBASE-5545
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.90.6
Reporter: gaojinchao
Assignee: gaojinchao
 Fix For: 0.90.7, 0.92.2, 0.94.1



[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487

2012-04-17 Thread Lars Hofhansl (Commented) (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13255770#comment-13255770
]

Lars Hofhansl commented on HBASE-5782:
--

Yep. It's crucial that syncTillHere is updated last.
The pendingWrites are appended strictly in order, so there is a very short race
that might lead to sync being issued multiple times when only one was necessary (it
seems the same race condition existed before).
So I think it is safe and low risk.

The question now is: Do this for 0.94 and then a more elaborate rewrite in
trunk, or do a more risky rewrite for 0.94?
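
The ordering argument above can be illustrated with a small sketch. It is not the patch attached to this issue. Only `pendingWrites`, `doneUpto`, and `syncedTillHere` are names taken from the discussion; `updateLock`, `flushLock`, `unflushedEntries`, and the in-memory `written` list (standing in for the log file) are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

class OrderedAppendSketch {
    private final Object updateLock = new Object(); // guards pendingWrites and the counter
    private final Object flushLock = new Object();  // serializes take-and-write
    private List<Long> pendingWrites = new ArrayList<>();
    private long unflushedEntries = 0;
    private final List<Long> written = new ArrayList<>(); // stands in for the log file
    private volatile long syncedTillHere = 0;

    /** Queues an edit and returns its sequence number. */
    long append() {
        synchronized (updateLock) {
            long txid = ++unflushedEntries;
            pendingWrites.add(txid);
            return txid;
        }
    }

    /** Writes all queued edits in order, then publishes progress last. */
    void sync(long txid) {
        if (txid <= syncedTillHere) {
            return; // an earlier sync already covered this edit
        }
        synchronized (flushLock) {
            List<Long> batch;
            long doneUpto;
            synchronized (updateLock) {
                // Read the counter and take the batch under the same lock.
                doneUpto = unflushedEntries;
                batch = pendingWrites;
                pendingWrites = new ArrayList<>();
            }
            written.addAll(batch);
            syncedTillHere = doneUpto; // updated only after the write
        }
    }

    List<Long> written() {
        synchronized (flushLock) {
            return new ArrayList<>(written);
        }
    }

    long syncedTillHere() {
        return syncedTillHere;
    }
}
```

In this sketch a caller whose edit is already covered returns early, and a caller that loses the race merely performs a redundant, empty flush, which corresponds to the "sync issued more often than necessary" case rather than to out-of-order edits.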

Edits can be appended out of seqid order since HBASE-4487
-

Attachments: 5782-lars-v2.txt, 5782-sketch.txt, 5782.txt,
5782.unfinished-stack.txt, HBASE-5782.patch, hbase-5782.txt

[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.

2012-04-17 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13255779#comment-13255779
 ] 

Lars Hofhansl commented on HBASE-5545:
--

+1 on patch. I assume there are no strange race conditions in this part of the code.

 region can't be opened for a long time. Because the creating File failed.
 -

 Key: HBASE-5545
 URL: https://issues.apache.org/jira/browse/HBASE-5545
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.90.6
Reporter: gaojinchao
Assignee: gaojinchao
 Fix For: 0.90.7, 0.92.2, 0.94.0

 Attachments: HBASE-5545.patch



[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487

2012-04-17 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13255828#comment-13255828
 ] 

Lars Hofhansl commented on HBASE-5782:
--

Anybody opposed to doing my patch for 0.94 and a rewrite in trunk? Todd? Stack? 
Ted?


 Edits can be appended out of seqid order since HBASE-4487
 -

 Key: HBASE-5782
 URL: https://issues.apache.org/jira/browse/HBASE-5782
 Project: HBase
  Issue Type: Bug
  Components: wal
Affects Versions: 0.94.0
Reporter: Gopinathan A
Assignee: ramkrishna.s.vasudevan
Priority: Blocker
 Fix For: 0.94.0

 Attachments: 5782-lars-v2.txt, 5782-sketch.txt, 5782.txt, 
 5782.unfinished-stack.txt, HBASE-5782.patch, hbase-5782.txt



[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487

2012-04-17 Thread Lars Hofhansl (Commented) (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13255911#comment-13255911
]

Lars Hofhansl commented on HBASE-5782:
--

@Stack: the doneUpTo is fine, because it is only used to set syncedTillHere. We
won't write more into the log (once we take the pendingWrites they are gone),
but we might sync too much unnecessarily.
Will do the (performance) testing now. I don't have a cluster at my disposal
atm, so I'll do it with a local HDFS instance.

If we'd rather finish Todd's for 0.94 that'd be nice.

Edits can be appended out of seqid order since HBASE-4487
-

Attachments: 5782-lars-v2.txt, 5782-sketch.txt, 5782.txt,
5782.unfinished-stack.txt, HBASE-5782.patch, hbase-5782.txt

[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487

2012-04-17 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13255939#comment-13255939
 ] 

Lars Hofhansl commented on HBASE-5782:
--

Ah, I see. Yes, the doneUpto line should be pulled into the synchronized block.


[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487

2012-04-17 Thread Lars Hofhansl (Commented) (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13255974#comment-13255974
]

Lars Hofhansl commented on HBASE-5782:
--

@Stack: I ran it too. It works fine. The interesting thing I find is that it is
faster *with* the patch! And that scares me.
I ran with 100 threads, 1 iterations. With the patch it took 29s, without
it took 43s. This is on a 6-core machine with hyper-threading.

Now, the HLogPerformanceEvaluation does not work with -nosync and -verify,
right? (presumably because no final sync is issued).

Wouldn't mind getting some other numbers from you as well, if you have some time.

Edits can be appended out of seqid order since HBASE-4487
-

Attachments: 5782-lars-v2.txt, 5782-sketch.txt, 5782.txt,
5782.unfinished-stack.txt, HBASE-5782.patch, hbase-5782.txt

[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487

2012-04-17 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256058#comment-13256058
 ] 

Lars Hofhansl commented on HBASE-5782:
--

Thanks Stack!

I'll run a test with fewer threads too, just to make sure. The fact that both 
of our patches are faster probably means that we did a lot of unnecessary 
sync'ing before...? A 20% performance increase is pretty damn awesome.


[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487

2012-04-17 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256072#comment-13256072
 ] 

Lars Hofhansl commented on HBASE-5782:
--

Did some more tests with HLogPerformanceEvaluation (and my patch):

10 threads, 10 iterations: with patch: 42s, without patch: 41s
5 threads, 20 iterations:  with patch: 46s, without patch: 44s
2 threads, 20 iterations:  with patch: 46s, without patch: 44s

So for fewer threads my patch is slightly slower.
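
For scale, the relative differences can be worked out from the timings quoted in this thread (including the earlier 100-thread run of 29s vs. 43s). The sketch below is plain arithmetic on those numbers; the class and method names are made up for illustration.

```java
final class PatchTimingSketch {
    /** Percent change of the patched run relative to the unpatched run (negative means faster). */
    static double percentChange(double withPatchSec, double withoutPatchSec) {
        return (withPatchSec - withoutPatchSec) / withoutPatchSec * 100.0;
    }

    public static void main(String[] args) {
        // {threads, withPatchSec, withoutPatchSec}, copied from the comments in this thread
        int[][] runs = { {100, 29, 43}, {10, 42, 41}, {5, 46, 44}, {2, 46, 44} };
        for (int[] r : runs) {
            System.out.printf("%d threads: %+.1f%%%n", r[0], percentChange(r[1], r[2]));
        }
    }
}
```

On these numbers the patched run is roughly a third faster at 100 threads and a few percent slower at 10 threads or fewer.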

So... What do we do? :)


[jira] [Commented] (HBASE-5741) ImportTsv does not check for table existence

2012-04-17 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256083#comment-13256083
 ] 

Lars Hofhansl commented on HBASE-5741:
--

Thanks Clint. No harm, I just think it will hide complexity that should not be 
hidden. I also don't think it's too much to ask a user to create the table 
first (but we should definitely fix the Javadoc in that case).

That all said, I am +-0 on this. It's a simple change, and v3 looks good.

 ImportTsv does not check for table existence 
 -

 Key: HBASE-5741
 URL: https://issues.apache.org/jira/browse/HBASE-5741
 Project: HBase
  Issue Type: Bug
  Components: mapreduce
Affects Versions: 0.90.4
Reporter: Clint Heath
Assignee: Himanshu Vashishtha
 Fix For: 0.96.0, 0.94.1

 Attachments: 5741-94.txt, 5741-v3.txt, HBase-5741-v2.patch, 
 HBase-5741.patch


 The usage statement for the importtsv command to hbase claims this:
 Note: if you do not use this option, then the target table must already 
 exist in HBase (in reference to the importtsv.bulk.output command-line 
 option)
 The truth is, the table must exist no matter what, importtsv cannot and will 
 not create it for you.
 This is the case because the createSubmittableJob method of ImportTsv does 
 not even attempt to check if the table exists already, much less create it:
 (From org.apache.hadoop.hbase.mapreduce.ImportTsv.java)
 305 HTable table = new HTable(conf, tableName);
 The HTable method signature in use there assumes the table exists and runs a 
 meta scan on it:
 (From org.apache.hadoop.hbase.client.HTable.java)
 142 * Creates an object to access a HBase table.
 ...
 151 public HTable(Configuration conf, final String tableName)
 What we should do inside of createSubmittableJob is something similar to what 
 the completebulkloads command would do:
 (Taken from org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.java)
 690 boolean tableExists = this.doesTableExist(tableName);
 691 if (!tableExists) this.createTable(tableName,dirPath);
 Currently the docs are misleading, the table in fact must exist prior to 
 running importtsv. We should check if it exists rather than assume it's 
 already there and throw the below exception:
 12/03/14 17:15:42 WARN client.HConnectionManager$HConnectionImplementation: 
 Encountered problems when prefetch META table: 
 org.apache.hadoop.hbase.TableNotFoundException: Cannot find row in .META. for 
 table: myTable2, row=myTable2,,99
   at 
 org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:150)
 ...
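
The check-then-create step proposed above could look roughly like the following. This is a sketch only: `TableAdmin`, `FakeAdmin`, and `ensureTable` are hypothetical names, not the HBase client API, and the in-memory fake exists just to make the example runnable.

```java
import java.util.HashSet;
import java.util.Set;

final class TableCheckSketch {
    /** Hypothetical stand-in for the admin calls; not the HBase client API. */
    interface TableAdmin {
        boolean tableExists(String name);
        void createTable(String name);
    }

    /** In-memory fake used only to exercise the sketch. */
    static final class FakeAdmin implements TableAdmin {
        private final Set<String> tables = new HashSet<>();
        public boolean tableExists(String name) { return tables.contains(name); }
        public void createTable(String name) { tables.add(name); }
    }

    /**
     * Returns true if the table had to be created, false if it already existed.
     * Fails fast with a clear message when the table is missing and creation is not allowed.
     */
    static boolean ensureTable(TableAdmin admin, String name, boolean createIfMissing) {
        if (admin.tableExists(name)) {
            return false;
        }
        if (!createIfMissing) {
            throw new IllegalStateException(
                "Table '" + name + "' does not exist; create it before importing");
        }
        admin.createTable(name);
        return true;
    }
}
```

Passing `createIfMissing = false` corresponds to the stricter option discussed in the comments: the user creates the table, and the tool only reports a clear error instead of a META scan failure.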


[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487

2012-04-17 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256094#comment-13256094
 ] 

Lars Hofhansl commented on HBASE-5782:
--

I'm +1 on committing my patch for 0.94.0. We can then either revisit for 
0.94.1+ or 0.96.


[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.

2012-04-17 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256198#comment-13256198
 ] 

Lars Hofhansl commented on HBASE-5545:
--

@Ram: Isn't this a pretty remote corner case?
If we feel strongly that this should go into 0.94.0, let's get it in.


 region can't be opened for a long time. Because the creating File failed.
 -

 Key: HBASE-5545
 URL: https://issues.apache.org/jira/browse/HBASE-5545
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.90.6
Reporter: gaojinchao
Assignee: gaojinchao
 Fix For: 0.90.7, 0.92.2, 0.94.0

 Attachments: HBASE-5545.patch


 Scenario:
 
 1. A file is created.
 2. While data is being written, all datanodes might crash, so the write 
 fails.
 3. Even if close is called in the finally block, close also fails and 
 throws an exception because the write failed.
 4. If the RS then tries to create the same file again, 
 AlreadyBeingCreatedException is thrown.
 Suggestion to handle this scenario:
 ---
 1. Check whether the file exists; if it does, delete it and create a 
 new file. 
 The delete call does not check whether the file is open or 
 closed.
 Overwrite option:
 
 1. The overwrite option applies only when you are overwriting a 
 closed file.
 2. If the file is not closed, the same AlreadyBeingCreatedException is 
 thrown even with the overwrite option.
 This is the expected behaviour, to prevent multiple clients from writing to 
 the same file.
 Region server logs:
 org.apache.hadoop.ipc.RemoteException: 
 org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to 
 create file /hbase/test1/12c01902324218d14b17a5880f24f64b/.tmp/.regioninfo 
 for 
 DFSClient_hb_rs_158-1-131-48,20020,1331107668635_1331107669061_-252463556_25 
 on client 158.1.132.19 because current leaseholder is trying to recreate file.
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:1570)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1440)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:1382)
 at org.apache.hadoop.hdfs.server.namenode.NameNode.create(NameNode.java:658)
 at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:547)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1137)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1133)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1131)
 at org.apache.hadoop.ipc.Client.call(Client.java:961)
 at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:245)
 at $Proxy6.create(Unknown Source)
 at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at $Proxy6.create(Unknown Source)
 at 
 org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.<init>(DFSClient.java:3643)
 at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:778)
 at 
 org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:364)
 at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:630)
 at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:611)
 at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:518)
 at 
 org.apache.hadoop.hbase.regionserver.HRegion.checkRegioninfoOnFilesystem(HRegion.java:424)
 at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:340)
 at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2672)
 at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2658)
 at 
 org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:330)
 at 
 org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:116)
 at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:158)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 [2012-03-07 20:51:45,858] [WARN ] 
 [RS_OPEN_REGION-158-1-131-48,20020,1331107668635-23] 
 [com.huawei.isap.ump.ha.client.RPCRetryAndSwitchInvoker 131] Retrying

[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487

2012-04-17 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256200#comment-13256200
 ] 

Lars Hofhansl commented on HBASE-5782:
--

Yes, we need a test for this. What scared me most about this bug was that no 
test caught it, and this is one of *the* core parts of HBase.

This test would basically just call code from HLogPerformanceEvaluation, right?



[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487

2012-04-17 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256202#comment-13256202
 ] 

Lars Hofhansl commented on HBASE-5782:
--

I can work on a test, unless you'd like to, Stack, or maybe you have started on it 
already.


[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.

2012-04-17 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256206#comment-13256206
 ] 

Lars Hofhansl commented on HBASE-5545:
--

I see. Yep, that is more likely to happen than a DN crash (I think).
No hurry Ram :)


[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.

2012-04-17 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256210#comment-13256210
 ] 

Lars Hofhansl commented on HBASE-5545:
--

2nd patch looks good. I don't think you needed to add the FSUtil code, but 
it cannot hurt.
fs.delete does not throw if the file does not exist, right?
Do you want to set recursive to false (just in case somebody changes this 
around and it ends up pointing to a directory)... This is a super minor nit.

+1 on commit if delete does not throw on a nonexistent file.
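
For illustration, the delete-then-create pattern with a non-recursive delete can be sketched with `java.nio.file`, which is swapped in here for Hadoop's `FileSystem` so the example runs without a cluster. `RecreateFileSketch` is a hypothetical name, and whether `fs.delete` behaves the same way on a missing file is exactly the open question above, so check the Hadoop `FileSystem` documentation before relying on it.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class RecreateFileSketch {
    /** Removes any stale file at target (non-recursively), then writes fresh content. */
    static boolean recreate(Path target, String content) throws IOException {
        // deleteIfExists returns false instead of throwing when nothing is there,
        // and throws an IOException rather than wiping a non-empty directory.
        boolean hadStale = Files.deleteIfExists(target);
        Files.write(target, content.getBytes(StandardCharsets.UTF_8));
        return hadStale;
    }

    /** Exercises the three cases: missing file, stale file, and a directory by mistake. */
    static boolean selfCheck() {
        try {
            Path dir = Files.createTempDirectory("recreate-sketch");
            Path file = dir.resolve(".regioninfo");
            boolean first = recreate(file, "v1");   // nothing to delete yet
            boolean second = recreate(file, "v2");  // stale copy removed first
            String content = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            boolean refusedDir;
            try {
                recreate(dir, "oops");              // dir is non-empty and must survive
                refusedDir = false;
            } catch (IOException e) {
                refusedDir = true;
            }
            return !first && second && content.equals("v2") && refusedDir && Files.exists(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The non-recursive delete is the point of the nit: if the path ever ends up pointing at a directory, the call fails loudly instead of removing its contents.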

 region can't be opened for a long time. Because the creating File failed.
 -

 Key: HBASE-5545
 URL: https://issues.apache.org/jira/browse/HBASE-5545
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.90.6
Reporter: gaojinchao
Assignee: gaojinchao
 Fix For: 0.90.7, 0.92.2, 0.94.0

 Attachments: HBASE-5545.patch, HBASE-5545.patch


[jira] [Commented] (HBASE-5795) hbase-3927 breaks 0.92-0.94 compatibility

2012-04-16 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13254864#comment-13254864
 ] 

Lars Hofhansl commented on HBASE-5795:
--

+1 on v2, are you integrating v2 with Stack's test?

 hbase-3927 breaks 0.92-0.94 compatibility
 ---

 Key: HBASE-5795
 URL: https://issues.apache.org/jira/browse/HBASE-5795
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: stack
 Fix For: 0.94.0, 0.96.0

 Attachments: 5795-v2.txt, 5795.unittest.txt


 This commit broke our 0.92/0.94 compatibility:
 {code}
 
 r1136686 | stack | 2011-06-16 14:18:08 -0700 (Thu, 16 Jun 2011) | 1 line
 HBASE-3927 display total uncompressed byte size of a region in web UI
 {code}
 I just tried the new RC for 0.94.  I brought up a 0.94 master on a 0.92 
 cluster and rather than just digest version 1 of the HServerLoad, I get this:
 {code}
 2012-04-14 22:47:59,752 WARN org.apache.hadoop.ipc.HBaseServer: Unable to 
 read call parameters for client 10.4.14.38
 java.io.IOException: Error in readFields
 at 
 org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:684)
 at 
 org.apache.hadoop.hbase.ipc.Invocation.readFields(Invocation.java:125)
 at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:1269)
 at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:1184)
 at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:722)
 at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.doRunLoop(HBaseServer.java:513)
 at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:488)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 Caused by: A record version mismatch occured. Expecting v2, found v1
 at 
 org.apache.hadoop.io.VersionedWritable.readFields(VersionedWritable.java:46)
 at 
 org.apache.hadoop.hbase.HServerLoad$RegionLoad.readFields(HServerLoad.java:379)
 at 
 org.apache.hadoop.hbase.HServerLoad.readFields(HServerLoad.java:686)
 at 
 org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:681)
 ... 9 more
 {code}
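
One way to avoid this kind of failure is for the reader to accept older record versions and default the fields they lack. The sketch below illustrates that idea under assumptions: `RegionLoadSketch` and its two fields are hypothetical, and it uses plain `java.io` streams rather than Hadoop's `VersionedWritable`. It is not the patch attached to this issue.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class RegionLoadSketch {
    static final byte CURRENT_VERSION = 2;

    final int storefiles;          // written by version 1 and version 2
    final int uncompressedSizeMB;  // only written by version 2

    RegionLoadSketch(int storefiles, int uncompressedSizeMB) {
        this.storefiles = storefiles;
        this.uncompressedSizeMB = uncompressedSizeMB;
    }

    /** Serializes in the layout of the given version (1 or 2). */
    byte[] encode(byte version) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeByte(version);
            out.writeInt(storefiles);
            if (version >= 2) {
                out.writeInt(uncompressedSizeMB);
            }
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads any supported version instead of insisting on CURRENT_VERSION. */
    static RegionLoadSketch decode(byte[] data) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            byte version = in.readByte();
            if (version < 1 || version > CURRENT_VERSION) {
                throw new IOException("Unsupported record version " + version);
            }
            int storefiles = in.readInt();
            // A version 1 peer never sent this field, so fall back to a default.
            int sizeMB = version >= 2 ? in.readInt() : 0;
            return new RegionLoadSketch(storefiles, sizeMB);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```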


[jira] [Commented] (HBASE-5782) Not all the regions are getting assigned after the log splitting.

2012-04-16 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13254925#comment-13254925
 ] 

Lars Hofhansl commented on HBASE-5782:
--

So the problem is that logSyncerThread keeps the edits in order but the syncer 
then applies the pending batches potentially out of order?

We might just need a sync lock to prevent two threads syncing at the same 
time. This is different from the update lock, which also prevents writing any 
edits.
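
A minimal sketch of the two-lock idea, assuming a simplified in-memory log. `SketchLog` and its members are hypothetical and are not the HLog code or the attached patch: appends take only the update lock, while a separate sync lock makes each sync pass (grab the pending batch, then write it) atomic with respect to other syncers.

```java
import java.util.ArrayList;
import java.util.List;

final class SketchLog {
    private final Object updateLock = new Object(); // guards pending edits and seqid assignment
    private final Object syncLock = new Object();   // serializes whole sync passes
    private List<Long> pending = new ArrayList<>();
    private final List<Long> written = new ArrayList<>();
    private long nextSeqId = 0;

    /** Queues an edit; blocked only briefly by a syncer swapping out the pending list. */
    long append() {
        synchronized (updateLock) {
            long seqId = nextSeqId++;
            pending.add(seqId);
            return seqId;
        }
    }

    /** Takes the pending batch and writes it while holding syncLock, so batches cannot interleave. */
    void sync() {
        synchronized (syncLock) {
            List<Long> batch;
            synchronized (updateLock) {
                batch = pending;
                pending = new ArrayList<>();
            }
            written.addAll(batch); // the slow "write" runs outside updateLock
        }
    }

    List<Long> writtenSnapshot() {
        synchronized (syncLock) {
            return new ArrayList<>(written);
        }
    }

    /** Runs concurrent append+sync loops and reports whether the output is in seqid order. */
    static boolean runDemo(int threads, int perThread) {
        final SketchLog log = new SketchLog();
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            Thread w = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    log.append();
                    log.sync();
                }
            });
            workers.add(w);
            w.start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        List<Long> out = log.writtenSnapshot();
        if (out.size() != threads * perThread) {
            return false;
        }
        for (int i = 0; i < out.size(); i++) {
            if (out.get(i) != i) {
                return false;
            }
        }
        return true;
    }
}
```

Writers are still able to append while a batch is being written, which is the difference from simply holding the update lock for the whole sync.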

 Not all the regions are getting assigned after the log splitting.
 -

 Key: HBASE-5782
 URL: https://issues.apache.org/jira/browse/HBASE-5782
 Project: HBase
  Issue Type: Bug
  Components: wal
Affects Versions: 0.94.0
Reporter: Gopinathan A
Assignee: ramkrishna.s.vasudevan
Priority: Blocker
 Fix For: 0.94.0

 Attachments: HBASE-5782.patch


 Create a table with 1000 splits. After the region assignment, kill the 
 regionserver which contains the META table.
 Here a few regions are missing after the log splitting and region assignment. 
 The HBCK report shows that multiple region holes were created.
 The same scenario was verified multiple times in 0.92.1 with no issues.


[jira] [Commented] (HBASE-5782) Not all the regions are getting assigned after the log splitting.

2012-04-16 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255094#comment-13255094
 ] 

Lars Hofhansl commented on HBASE-5782:
--

The short term choices we have are:
# revert HBASE-4528, HBASE-4487, and HBASE-5541 (are there others?)
# Partially revert HBASE-2467 (or devise other ways to have strictly one thread 
flushing an HLog).

Maybe Todd as the author of HBASE-2467 could chime in... Todd?

 Not all the regions are getting assigned after the log splitting.
 -

 Key: HBASE-5782
 URL: https://issues.apache.org/jira/browse/HBASE-5782
 Project: HBase
  Issue Type: Bug
  Components: wal
Affects Versions: 0.94.0
Reporter: Gopinathan A
Assignee: ramkrishna.s.vasudevan
Priority: Blocker
 Fix For: 0.94.0

 Attachments: HBASE-5782.patch



[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487

2012-04-16 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255168#comment-13255168
 ] 

Lars Hofhansl commented on HBASE-5782:
--

Is this really only a problem because of HBASE-4528 and HBASE-4487?
It seems that even without HLog.appendNoSync() it is possible that one thread 
flushes an entire batch of pending writes before a thread that started earlier 
can get to it.

 Edits can be appended out of seqid order since HBASE-4487
 -

 Key: HBASE-5782
 URL: https://issues.apache.org/jira/browse/HBASE-5782
 Project: HBase
  Issue Type: Bug
  Components: wal
Affects Versions: 0.94.0
Reporter: Gopinathan A
Assignee: ramkrishna.s.vasudevan
Priority: Blocker
 Fix For: 0.94.0

 Attachments: 5782.txt, HBASE-5782.patch



[jira] [Commented] (HBASE-5780) Fix race in HBase regionserver startup vs ZK SASL authentication

2012-04-16 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255170#comment-13255170
 ] 

Lars Hofhansl commented on HBASE-5780:
--

testResetZooKeeperSession failed in 0.94 again. :(

 Fix race in HBase regionserver startup vs ZK SASL authentication
 

 Key: HBASE-5780
 URL: https://issues.apache.org/jira/browse/HBASE-5780
 Project: HBase
  Issue Type: Bug
  Components: security
Affects Versions: 0.92.1, 0.94.0
Reporter: Shaneal Manek
Assignee: Shaneal Manek
 Fix For: 0.92.2, 0.94.0, 0.96.0

 Attachments: HBASE-5780-v2.patch, HBASE-5780.patch, 
 TestReplicationPeer-Security-output.log, TestReplicationPeer-output.log, 
 testoutput.tar.gz


 Secure RegionServers sometimes fail to start with the following backtrace:
 2012-03-22 17:20:16,737 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server centos60-20.ent.cloudera.com,60020,1332462015929: Unexpected exception during initialization, aborting
 org.apache.zookeeper.KeeperException$NoAuthException: KeeperErrorCode = NoAuth for /hbase/shutdown
 at org.apache.zookeeper.KeeperException.create(KeeperException.java:113)
 at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
 at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1131)
 at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:295)
 at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataInternal(ZKUtil.java:518)
 at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:494)
 at org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.start(ZooKeeperNodeTracker.java:77)
 at org.apache.hadoop.hbase.regionserver.HRegionServer.initializeZooKeeper(HRegionServer.java:569)
 at org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:532)
 at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:634)
 at java.lang.Thread.run(Thread.java:662)


[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487

2012-04-16 Thread Lars Hofhansl (Commented) (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255191#comment-13255191
]

Lars Hofhansl commented on HBASE-5782:
--

The sketched patch looks good; it makes sure that we do not sync a batch of
writes before the previous batch is done.

I was thinking about something similar, but still allowing multiple threads to
write, just that a thread cannot start writing before the previous batch is
confirmed synced. I guess we actually wouldn't get more parallelism out of that
than with your approach.

Edits can be appended out of seqid order since HBASE-4487
-

Attachments: 5782-sketch.txt, 5782.txt, HBASE-5782.patch

[jira] [Commented] (HBASE-5778) Turn on WAL compression by default

2012-04-16 Thread Lars Hofhansl (Commented) (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255214#comment-13255214
]

Lars Hofhansl commented on HBASE-5778:
--

If we could tail the logs it would work. We just cannot seek into an HLog in
the middle and start reading from it.

Turn on WAL compression by default
--

Key: HBASE-5778
URL: https://issues.apache.org/jira/browse/HBASE-5778
Project: HBase
Issue Type: Improvement
Reporter: Jean-Daniel Cryans
Assignee: Lars Hofhansl
Priority: Blocker
Fix For: 0.96.0, 0.94.1

Attachments: 5778-addendum.txt, 5778.addendum, HBASE-5778.patch

I ran some tests to verify if WAL compression should be turned on by default.
For a use case where it's not very useful (values two orders of magnitude
bigger than the keys), the insert time wasn't different and the CPU usage was 15%
higher (150% CPU usage vs. 130% when not compressing the WAL).
When values are smaller than the keys, I saw a 38% improvement in the insert
run time, and CPU usage was 33% higher (600% CPU usage vs. 450%). I'm not sure
WAL compression accounts for all the additional CPU usage; it might just be
that we're able to insert faster and we spend more time in the MemStore per
second (because our MemStores are bad when they contain tens of thousands of
values).
Those are two extremes, but it shows that for the price of some CPU we can
save a lot. My machines have 2 quads with HT, so I still had a lot of idle
CPUs.

[jira] [Commented] (HBASE-5778) Turn on WAL compression by default

2012-04-16 Thread Lars Hofhansl (Commented) (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255230#comment-13255230
]

Lars Hofhansl commented on HBASE-5778:
--

Hmm... Then that does not explain what I saw. I saw the ReplicationSource
trying to read from a position in the file (indicated by ZK) and then the read
failing because the dictionary was not built up.

Turn on WAL compression by default
--

Attachments: 5778-addendum.txt, 5778.addendum, HBASE-5778.patch

[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487

2012-04-16 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13255308#comment-13255308
 ] 

Lars Hofhansl commented on HBASE-5782:
--

@Ram: It looks like your patch should work for this scenario, but I'd be 
generally wary about edits not being in order in the WAL.

 Edits can be appended out of seqid order since HBASE-4487
 -

 Key: HBASE-5782
 URL: https://issues.apache.org/jira/browse/HBASE-5782
 Project: HBase
  Issue Type: Bug
  Components: wal
Affects Versions: 0.94.0
Reporter: Gopinathan A
Assignee: ramkrishna.s.vasudevan
Priority: Blocker
 Fix For: 0.94.0

 Attachments: 5782-sketch.txt, 5782.txt, 5782.unfinished-stack.txt, 
 HBASE-5782.patch


 Create a table with 1000 splits, after the region assignemnt, kill the 
 regionserver wich contains META table.
 Here few regions are missing after the log splitting and region assigment. 
 HBCK report shows multiple region holes are got created.
 Same scenario was verified mulitple times in 0.92.1, no issues.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5795) hbase-3927 breaks 0.92-0.94 compatibility

2012-04-15 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13254414#comment-13254414
 ] 

Lars Hofhansl commented on HBASE-5795:
--

+1 on patch.

 hbase-3927 breaks 0.92-0.94 compatibility
 ---

 Key: HBASE-5795
 URL: https://issues.apache.org/jira/browse/HBASE-5795
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: stack
 Fix For: 0.94.0

 Attachments: 5794.txt, 5795-v1.txt


 This commit broke our 0.92/0.94 compatibility:
 {code}
 
 r1136686 | stack | 2011-06-16 14:18:08 -0700 (Thu, 16 Jun 2011) | 1 line
 HBASE-3927 display total uncompressed byte size of a region in web UI
 {code}
 I just tried the new RC for 0.94.  I brought up a 0.94 master on a 0.92 
 cluster and rather than just digest version 1 of the HServerLoad, I get this:
 {code}
 2012-04-14 22:47:59,752 WARN org.apache.hadoop.ipc.HBaseServer: Unable to read call parameters for client 10.4.14.38
 java.io.IOException: Error in readFields
 at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:684)
 at org.apache.hadoop.hbase.ipc.Invocation.readFields(Invocation.java:125)
 at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:1269)
 at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:1184)
 at org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:722)
 at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.doRunLoop(HBaseServer.java:513)
 at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:488)
 at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 Caused by: A record version mismatch occured. Expecting v2, found v1
 at org.apache.hadoop.io.VersionedWritable.readFields(VersionedWritable.java:46)
 at org.apache.hadoop.hbase.HServerLoad$RegionLoad.readFields(HServerLoad.java:379)
 at org.apache.hadoop.hbase.HServerLoad.readFields(HServerLoad.java:686)
 at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:681)
 ... 9 more
 {code}
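The trace above shows the reader rejecting a v1 record outright because it expects v2. One way a reader could instead "just digest" the older version is to branch on the version byte and default the fields that the older format lacks. The following is a minimal sketch under stated assumptions, not the patch attached to this issue: `RegionLoadSketch` and both of its fields are hypothetical stand-ins.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

/** Hypothetical versioned record: v1 carries one field, v2 adds a second. */
class RegionLoadSketch {
    static final byte CURRENT_VERSION = 2;

    int storefiles;          // assumed to exist since v1
    long uncompressedBytes;  // assumed to be added in v2

    /** Serializes a record in the layout of the requested version. */
    static byte[] write(byte version, int storefiles, long uncompressedBytes) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeByte(version);
        out.writeInt(storefiles);
        if (version >= 2) {
            out.writeLong(uncompressedBytes);
        }
        out.flush();
        return bytes.toByteArray();
    }

    /** Reads any version from 1 up to CURRENT_VERSION; fields missing in older versions default to 0. */
    static RegionLoadSketch read(byte[] data) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        byte version = in.readByte();
        if (version < 1 || version > CURRENT_VERSION) {
            throw new IOException("Unsupported record version v" + version);
        }
        RegionLoadSketch r = new RegionLoadSketch();
        r.storefiles = in.readInt();
        r.uncompressedBytes = version >= 2 ? in.readLong() : 0L;
        return r;
    }
}
```

A strict equality check on the version byte would reject the v1 record; the range check here accepts it and only refuses versions newer than the reader understands.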


[jira] [Commented] (HBASE-5795) hbase-3927 breaks 0.92-0.94 compatibility

2012-04-15 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13254429#comment-13254429
 ] 

Lars Hofhansl commented on HBASE-5795:
--

@Stack: I think you forgot to include the actual test in the patch :)

 hbase-3927 breaks 0.92-0.94 compatibility
 ---

 Key: HBASE-5795
 URL: https://issues.apache.org/jira/browse/HBASE-5795
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: stack
 Fix For: 0.94.0

 Attachments: 5795-v1.txt



[jira] [Commented] (HBASE-5795) hbase-3927 breaks 0.92-0.94 compatibility

2012-04-15 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13254430#comment-13254430
 ] 

Lars Hofhansl commented on HBASE-5795:
--

Oh, you attached your 5794 patch to this issue. I removed it to avoid confusion. 
When you get a chance, could you attach your patch here? Then we can make a 
combined patch with that and Ted's fix.

 hbase-3927 breaks 0.92-0.94 compatibility
 ---

 Key: HBASE-5795
 URL: https://issues.apache.org/jira/browse/HBASE-5795
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: stack
 Fix For: 0.94.0

 Attachments: 5795-v1.txt



[jira] [Commented] (HBASE-5782) Not all the regions are getting assigned after the log splitting.

2012-04-15 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13254433#comment-13254433
 ] 

Lars Hofhansl commented on HBASE-5782:
--

Thanks Ram!

 Not all the regions are getting assigned after the log splitting.
 -

 Key: HBASE-5782
 URL: https://issues.apache.org/jira/browse/HBASE-5782
 Project: HBase
  Issue Type: Bug
  Components: wal
Affects Versions: 0.94.0
Reporter: Gopinathan A
Assignee: ramkrishna.s.vasudevan
Priority: Critical
 Fix For: 0.94.0



[jira] [Commented] (HBASE-5796) Fix our abuse of IOE: see http://blog.tsunanet.net/2012/04/apache-hadoop-abuse-ioexception.html

2012-04-15 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13254470#comment-13254470
 ] 

Lars Hofhansl commented on HBASE-5796:
--

Could be as simple as a better hierarchy of classes extending IOException (like 
DoNotRetryException).
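As an illustration of that idea, here is a small sketch of IOException subclasses that let a caller decide whether a retry makes sense. All class names below (`DoNotRetrySketchException`, `UnknownRowSketchException`, `RetryingCaller`) are hypothetical and chosen for the example; they are not claimed to match classes in the HBase code base.

```java
import java.io.IOException;

/** Hypothetical: the request can never succeed, so callers should not retry. */
class DoNotRetrySketchException extends IOException {
    DoNotRetrySketchException(String message) {
        super(message);
    }
}

/** Hypothetical, more specific failure that inherits the do-not-retry behaviour. */
class UnknownRowSketchException extends DoNotRetrySketchException {
    UnknownRowSketchException(String message) {
        super(message);
    }
}

class RetryingCaller {
    /** A call that may fail with any IOException. */
    interface Call<T> {
        T run() throws IOException;
    }

    /** Retries plain IOExceptions up to maxAttempts; fails fast on the do-not-retry branch. */
    static <T> T callWithRetries(Call<T> call, int maxAttempts) throws IOException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.run();
            } catch (DoNotRetrySketchException e) {
                throw e;     // retrying cannot help
            } catch (IOException e) {
                last = e;    // possibly transient, try again
            }
        }
        throw last;
    }
}
```

Because the subclasses still extend IOException, existing method signatures would not have to change; only callers that care about the distinction need to catch the narrower types.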

 Fix our abuse of IOE: see 
 http://blog.tsunanet.net/2012/04/apache-hadoop-abuse-ioexception.html
 ---

 Key: HBASE-5796
 URL: https://issues.apache.org/jira/browse/HBASE-5796
 Project: HBase
  Issue Type: Task
Reporter: stack

 Let's make more context-specific exceptions rather than throw IOEs 
 everywhere.  See Benoît's rant: 
 http://blog.tsunanet.net/2012/04/apache-hadoop-abuse-ioexception.html


[jira] [Commented] (HBASE-5795) hbase-3927 breaks 0.92-0.94 compatibility

2012-04-15 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13254510#comment-13254510
 ] 

Lars Hofhansl commented on HBASE-5795:
--

@Ted: I don't understand why that redundant write in 0.94 causes any problem. 
Can you elaborate? Was there another problem in v1?


 hbase-3927 breaks 0.92-0.94 compatibility
 ---

 Key: HBASE-5795
 URL: https://issues.apache.org/jira/browse/HBASE-5795
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: stack
 Fix For: 0.94.0

 Attachments: 5795-v2.txt, 5795.unittest.txt



[jira] [Commented] (HBASE-5778) Turn on WAL compression by default

2012-04-14 Thread Lars Hofhansl (Commented) (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13254204#comment-13254204
]

Lars Hofhansl commented on HBASE-5778:
--

@Ted: Unfortunately it is not as simple as that. As I tried to explain above,
the ReplicationSource reads from the WAL files at offsets that are stored in
ZK. This does not work any longer, as you can no longer start reading the WAL
at an offset. The files need to be read from the beginning to build up the
dictionary.
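To illustrate why a dictionary-compressed log cannot be picked up at an arbitrary offset, here is a toy model. It is only a sketch of the general technique and makes no claim about the actual WAL compression format: the `L:`/`R:` record encoding and the `DictLog` class are invented for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy dictionary-compressed log (hypothetical format). */
class DictLog {
    /** First occurrence of a value is written as "L:<value>", repeats as "R:<index>". */
    static List<String> encode(List<String> values) {
        Map<String, Integer> dict = new HashMap<>();
        List<String> out = new ArrayList<>();
        for (String v : values) {
            Integer idx = dict.get(v);
            if (idx == null) {
                dict.put(v, dict.size());   // next free index
                out.add("L:" + v);
            } else {
                out.add("R:" + idx);
            }
        }
        return out;
    }

    /** Decodes records starting at position from; the dictionary always starts empty. */
    static List<String> decode(List<String> records, int from) {
        List<String> dict = new ArrayList<>();
        List<String> out = new ArrayList<>();
        for (int i = from; i < records.size(); i++) {
            String r = records.get(i);
            if (r.startsWith("L:")) {
                String v = r.substring(2);
                dict.add(v);
                out.add(v);
            } else {
                int idx = Integer.parseInt(r.substring(2));
                if (idx >= dict.size()) {
                    throw new IllegalStateException(
                        "record " + i + " references dictionary entry " + idx + " that was never read");
                }
                out.add(dict.get(idx));
            }
        }
        return out;
    }
}
```

Decoding from position 0 reproduces the input, while decoding from a later position fails as soon as a record refers to an entry defined earlier in the file. In this toy format a mid-file start could also resolve a reference to the wrong entry without any error, which is arguably worse than failing.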

Turn on WAL compression by default
--

Attachments: 5778-addendum.txt, 5778.addendum, HBASE-5778.patch

[jira] [Commented] (HBASE-5781) Zookeeper session got closed while trying to assign the region to RS using hbck -fix

2012-04-14 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13254205#comment-13254205
 ] 

Lars Hofhansl commented on HBASE-5781:
--

@Jon: Are you saying we should sink RC1 for this?

 Zookeeper session got closed while trying to assign the region to RS using 
 hbck -fix
 

 Key: HBASE-5781
 URL: https://issues.apache.org/jira/browse/HBASE-5781
 Project: HBase
  Issue Type: Bug
  Components: hbck
Affects Versions: 0.94.0, 0.96.0
Reporter: Kristam Subba Swathi
Assignee: Jonathan Hsieh
Priority: Critical
 Fix For: 0.94.0


 After running hbck on the cluster, it was found that one region was not 
 assigned.
 So hbck -fix was used to fix this, 
 but the assignment didn't happen because the ZooKeeper session was closed.
 Please find the attached trace for more details.
 -
 Trying to fix unassigned region...
 12/04/03 11:02:57 INFO util.HBaseFsckRepair: Region still in transition, waiting for it to become assigned: {NAME => 'ufdr,002300,179123498.00871fbd7583512e12c4eb38e900be8d.', STARTKEY => '002300', ENDKEY => '002311', ENCODED => 00871fbd7583512e12c4eb38e900be8d,}
 12/04/03 11:02:58 INFO client.HConnectionManager$HConnectionImplementation: Closed zookeeper sessionid=0x236738a263a
 12/04/03 11:02:58 INFO zookeeper.ZooKeeper: Session: 0x236738a263a closed
 ERROR: Region { meta => ufdr,010444,179123857.01594219211d0035b9586f98954462e1., hdfs => hdfs://10.18.40.25:9000/hbase/ufdr/01594219211d0035b9586f98954462e1, deployed => } not deployed on any region server.
 Trying to fix unassigned region...
 12/04/03 11:02:58 INFO zookeeper.ClientCnxn: EventThread shut down
 12/04/03 11:02:58 WARN zookeeper.ZKUtil: hconnection-0x236738a263a Unable to set watcher on znode (/hbase)
 org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase
 at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
 at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
 at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1021)
 at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:150)
 at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:263)
 at org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.checkIfBaseNodeAvailable(ZooKeeperNodeTracker.java:208)
 at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.checkIfBaseNodeAvailable(HConnectionManager.java:695)
 at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getMaster(HConnectionManager.java:626)
 at org.apache.hadoop.hbase.client.HBaseAdmin.getMaster(HBaseAdmin.java:211)
 at org.apache.hadoop.hbase.client.HBaseAdmin.assign(HBaseAdmin.java:1325)
 at org.apache.hadoop.hbase.util.HBaseFsckRepair.forceOfflineInZK(HBaseFsckRepair.java:109)
 at org.apache.hadoop.hbase.util.HBaseFsckRepair.fixUnassigned(HBaseFsckRepair.java:92)
 at org.apache.hadoop.hbase.util.HBaseFsck.tryAssignmentRepair(HBaseFsck.java:1235)
 at org.apache.hadoop.hbase.util.HBaseFsck.checkRegionConsistency(HBaseFsck.java:1351)
 at org.apache.hadoop.hbase.util.HBaseFsck.checkAndFixConsistency(HBaseFsck.java:1114)
 at org.apache.hadoop.hbase.util.HBaseFsck.onlineConsistencyRepair(HBaseFsck.java:356)
 at org.apache.hadoop.hbase.util.HBaseFsck.onlineHbck(HBaseFsck.java:375)
 at org.apache.hadoop.hbase.util.HBaseFsck.main(HBaseFsck.java:2894)
 12/04/03 11:02:58 ERROR zookeeper.ZooKeeperWatcher: hconnection-0x236738a263a Received unexpected KeeperException, re-throwing exception
 org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase
 at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
 at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
 at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1021)
 at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:150)
 at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:263)
 at org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.checkIfBaseNodeAvailable(ZooKeeperNodeTracker.java:208)
 at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.checkIfBaseNodeAvailable(HConnectionManager.java:695)
 at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getMaster(HConnectionManager.java:626)
 at org.apache.hadoop.hbase.client.HBaseAdmin.getMaster(HBaseAdmin.java:211)
 at org.apache.hadoop.hbase.client.HBaseAdmin.assign(HBaseAdmin.java:1325)
 at

[jira] [Commented] (HBASE-5790) ZKUtil deleteRecursively should be a recoverable operation

2012-04-14 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13254207#comment-13254207
 ] 

Lars Hofhansl commented on HBASE-5790:
--

@Jesse: I was referring more to the request size that is shipped to ZK. Probably 
not a problem.


 ZKUtil deleteRecursively should be a recoverable operation
 --

 Key: HBASE-5790
 URL: https://issues.apache.org/jira/browse/HBASE-5790
 Project: HBase
  Issue Type: Improvement
Reporter: Jesse Yates
Assignee: Jesse Yates
  Labels: zookeeper
 Fix For: 0.96.0, 0.94.1

 Attachments: java_HBASE-5790-v1.patch, java_HBASE-5790.patch


 As of 3.4.3, ZooKeeper has full multi-operation transactions. This means 
 we can wholesale delete chunks of the zk tree and ensure that we don't have 
 any pesky recursive delete issues where we delete the children of a node, but 
 then a child joins before deletion of the parent. Even without transactions, 
 this should be the behavior, but it is possible to make it much cleaner now 
 that we have this new feature in zk.
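As a rough illustration of the description above, the delete operations for a subtree can be collected children-first and then handed to ZooKeeper as one transaction (the client exposes `ZooKeeper.multi` and `Op.delete` for this since 3.4). The sketch below models only the ordering step over an in-memory tree so that it stays self-contained; `DeletePlan` and its map-based tree representation are assumptions made for the example and are not part of ZKUtil.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: computes the order in which znodes of a subtree must be deleted. */
class DeletePlan {
    /**
     * Returns every path under root, children before parents, ending with root itself.
     * The children map goes from a full znode path to the names of its direct children.
     */
    static List<String> deleteOrder(String root, Map<String, List<String>> children) {
        List<String> order = new ArrayList<>();
        collect(root, children, order);
        return order;
    }

    // Post-order walk: a node is added only after all of its descendants.
    private static void collect(String node, Map<String, List<String>> children, List<String> order) {
        for (String child : children.getOrDefault(node, Collections.emptyList())) {
            collect(node + "/" + child, children, order);
        }
        order.add(node);
    }
}
```

If the resulting paths are submitted as a single multi-operation request, either all deletes apply or none do, so a child created between listing and deleting would make the whole transaction fail rather than leave a half-deleted tree.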


[jira] [Commented] (HBASE-5790) ZKUtil deleteRecursively should be a recoverable operation

2012-04-14 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13254208#comment-13254208
 ] 

Lars Hofhansl commented on HBASE-5790:
--

+1 on V2

 ZKUtil deleteRecursively should be a recoverable operation
 --

 Key: HBASE-5790
 URL: https://issues.apache.org/jira/browse/HBASE-5790
 Project: HBase
  Issue Type: Improvement
Reporter: Jesse Yates
Assignee: Jesse Yates
  Labels: zookeeper
 Fix For: 0.96.0, 0.94.1

 Attachments: java_HBASE-5790-v1.patch, java_HBASE-5790.patch


 As of 3.4.3, ZooKeeper has full multi-operation transactions. This means 
 we can wholesale delete chunks of the zk tree and ensure that we don't have 
 any pesky recursive delete issues where we delete the children of a node, but 
 then a child joins before deletion of the parent. Even without transactions, 
 this should be the behavior, but it is possible to make it much cleaner now 
 that we have this new feature in zk.


[jira] [Commented] (HBASE-5781) Zookeeper session got closed while trying to assign the region to RS using hbck -fix

2012-04-14 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13254211#comment-13254211
 ] 

Lars Hofhansl commented on HBASE-5781:
--

Dang :(

 Zookeeper session got closed while trying to assign the region to RS using 
 hbck -fix
 

 Key: HBASE-5781
 URL: https://issues.apache.org/jira/browse/HBASE-5781
 Project: HBase
  Issue Type: Bug
  Components: hbck
Affects Versions: 0.94.0, 0.96.0
Reporter: Kristam Subba Swathi
Assignee: Jonathan Hsieh
Priority: Critical
 Fix For: 0.94.0


 After running hbck on the cluster, one region was found to be unassigned, 
 so hbck -fix was used to fix it. 
 But the assignment didn't happen because the ZooKeeper session was closed. 
 Please find the attached trace for more details.
 -
 Trying to fix unassigned region...
 12/04/03 11:02:57 INFO util.HBaseFsckRepair: Region still in transition, 
 waiting for it to become assigned: {NAME => 
 'ufdr,002300,179123498.00871fbd7583512e12c4eb38e900be8d.', STARTKEY => 
 '002300', ENDKEY => '002311', ENCODED => 00871fbd7583512e12c4eb38e900be8d,}
 12/04/03 11:02:58 INFO client.HConnectionManager$HConnectionImplementation: 
 Closed zookeeper sessionid=0x236738a263a
 12/04/03 11:02:58 INFO zookeeper.ZooKeeper: Session: 0x236738a263a closed
 ERROR: Region { meta => 
 ufdr,010444,179123857.01594219211d0035b9586f98954462e1., hdfs => 
 hdfs://10.18.40.25:9000/hbase/ufdr/01594219211d0035b9586f98954462e1, deployed 
 => } not deployed on any region server.
 Trying to fix unassigned region...
 12/04/03 11:02:58 INFO zookeeper.ClientCnxn: EventThread shut down
 12/04/03 11:02:58 WARN zookeeper.ZKUtil: hconnection-0x236738a263a Unable 
 to set watcher on znode (/hbase)
 org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode 
 = Session expired for /hbase
 at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
 at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
 at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1021)
 at 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:150)
 at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:263)
 at 
 org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.checkIfBaseNodeAvailable(ZooKeeperNodeTracker.java:208)
 at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.checkIfBaseNodeAvailable(HConnectionManager.java:695)
 at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getMaster(HConnectionManager.java:626)
 at org.apache.hadoop.hbase.client.HBaseAdmin.getMaster(HBaseAdmin.java:211)
 at org.apache.hadoop.hbase.client.HBaseAdmin.assign(HBaseAdmin.java:1325)
 at 
 org.apache.hadoop.hbase.util.HBaseFsckRepair.forceOfflineInZK(HBaseFsckRepair.java:109)
 at 
 org.apache.hadoop.hbase.util.HBaseFsckRepair.fixUnassigned(HBaseFsckRepair.java:92)
 at 
 org.apache.hadoop.hbase.util.HBaseFsck.tryAssignmentRepair(HBaseFsck.java:1235)
 at 
 org.apache.hadoop.hbase.util.HBaseFsck.checkRegionConsistency(HBaseFsck.java:1351)
 at 
 org.apache.hadoop.hbase.util.HBaseFsck.checkAndFixConsistency(HBaseFsck.java:1114)
 at 
 org.apache.hadoop.hbase.util.HBaseFsck.onlineConsistencyRepair(HBaseFsck.java:356)
 at org.apache.hadoop.hbase.util.HBaseFsck.onlineHbck(HBaseFsck.java:375)
 at org.apache.hadoop.hbase.util.HBaseFsck.main(HBaseFsck.java:2894)
 12/04/03 11:02:58 ERROR zookeeper.ZooKeeperWatcher: 
 hconnection-0x236738a263a Received unexpected KeeperException, 
 re-throwing exception
 org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode 
 = Session expired for /hbase
 at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
 at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
 at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1021)
 at 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:150)
 at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:263)
 at 
 org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.checkIfBaseNodeAvailable(ZooKeeperNodeTracker.java:208)
 at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.checkIfBaseNodeAvailable(HConnectionManager.java:695)
 at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getMaster(HConnectionManager.java:626)
 at org.apache.hadoop.hbase.client.HBaseAdmin.getMaster(HBaseAdmin.java:211)
 at org.apache.hadoop.hbase.client.HBaseAdmin.assign(HBaseAdmin.java:1325)
 at

[jira] [Commented] (HBASE-5781) Zookeeper session got closed while trying to assign the region to RS using hbck -fix

2012-04-14 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13254260#comment-13254260
 ] 

Lars Hofhansl commented on HBASE-5781:
--

+1 on patch.

 Zookeeper session got closed while trying to assign the region to RS using 
 hbck -fix
 

 Key: HBASE-5781
 URL: https://issues.apache.org/jira/browse/HBASE-5781
 Project: HBase
  Issue Type: Bug
  Components: hbck
Affects Versions: 0.90.7, 0.92.1, 0.94.0, 0.96.0
Reporter: Kristam Subba Swathi
Assignee: Jonathan Hsieh
Priority: Critical
 Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0

 Attachments: hbase-5781.patch


 After running hbck on the cluster, one region was found to be unassigned, 
 so hbck -fix was used to fix it. 
 But the assignment didn't happen because the ZooKeeper session was closed. 
 Please find the attached trace for more details.
 -
 Trying to fix unassigned region...
 12/04/03 11:02:57 INFO util.HBaseFsckRepair: Region still in transition, 
 waiting for it to become assigned: {NAME => 
 'ufdr,002300,179123498.00871fbd7583512e12c4eb38e900be8d.', STARTKEY => 
 '002300', ENDKEY => '002311', ENCODED => 00871fbd7583512e12c4eb38e900be8d,}
 12/04/03 11:02:58 INFO client.HConnectionManager$HConnectionImplementation: 
 Closed zookeeper sessionid=0x236738a263a
 12/04/03 11:02:58 INFO zookeeper.ZooKeeper: Session: 0x236738a263a closed
 ERROR: Region { meta => 
 ufdr,010444,179123857.01594219211d0035b9586f98954462e1., hdfs => 
 hdfs://10.18.40.25:9000/hbase/ufdr/01594219211d0035b9586f98954462e1, deployed 
 => } not deployed on any region server.
 Trying to fix unassigned region...
 12/04/03 11:02:58 INFO zookeeper.ClientCnxn: EventThread shut down
 12/04/03 11:02:58 WARN zookeeper.ZKUtil: hconnection-0x236738a263a Unable 
 to set watcher on znode (/hbase)
 org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode 
 = Session expired for /hbase
 at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
 at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
 at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1021)
 at 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:150)
 at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:263)
 at 
 org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.checkIfBaseNodeAvailable(ZooKeeperNodeTracker.java:208)
 at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.checkIfBaseNodeAvailable(HConnectionManager.java:695)
 at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getMaster(HConnectionManager.java:626)
 at org.apache.hadoop.hbase.client.HBaseAdmin.getMaster(HBaseAdmin.java:211)
 at org.apache.hadoop.hbase.client.HBaseAdmin.assign(HBaseAdmin.java:1325)
 at 
 org.apache.hadoop.hbase.util.HBaseFsckRepair.forceOfflineInZK(HBaseFsckRepair.java:109)
 at 
 org.apache.hadoop.hbase.util.HBaseFsckRepair.fixUnassigned(HBaseFsckRepair.java:92)
 at 
 org.apache.hadoop.hbase.util.HBaseFsck.tryAssignmentRepair(HBaseFsck.java:1235)
 at 
 org.apache.hadoop.hbase.util.HBaseFsck.checkRegionConsistency(HBaseFsck.java:1351)
 at 
 org.apache.hadoop.hbase.util.HBaseFsck.checkAndFixConsistency(HBaseFsck.java:1114)
 at 
 org.apache.hadoop.hbase.util.HBaseFsck.onlineConsistencyRepair(HBaseFsck.java:356)
 at org.apache.hadoop.hbase.util.HBaseFsck.onlineHbck(HBaseFsck.java:375)
 at org.apache.hadoop.hbase.util.HBaseFsck.main(HBaseFsck.java:2894)
 12/04/03 11:02:58 ERROR zookeeper.ZooKeeperWatcher: 
 hconnection-0x236738a263a Received unexpected KeeperException, 
 re-throwing exception
 org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode 
 = Session expired for /hbase
 at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
 at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
 at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1021)
 at 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:150)
 at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:263)
 at 
 org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.checkIfBaseNodeAvailable(ZooKeeperNodeTracker.java:208)
 at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.checkIfBaseNodeAvailable(HConnectionManager.java:695)
 at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getMaster(HConnectionManager.java:626)
 at org.apache.hadoop.hbase.client.HBaseAdmin.getMaster(HBaseAdmin.java:211)
 at org.apache.hadoop.hbase.client.HBaseAdmin.assign(HBaseAdmin.java:1325)

[jira] [Commented] (HBASE-5256) Use WritableUtils.readVInt() in RegionLoad.readFields()

2012-04-14 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13254262#comment-13254262
 ] 

Lars Hofhansl commented on HBASE-5256:
--

I see. This needs to be reverted from the 0.94 branch, otherwise it breaks 
compatibility with 0.92.

 Use WritableUtils.readVInt() in RegionLoad.readFields()
 ---

 Key: HBASE-5256
 URL: https://issues.apache.org/jira/browse/HBASE-5256
 Project: HBase
  Issue Type: Task
Reporter: Zhihong Yu
Assignee: Mubarak Seyed
 Fix For: 0.94.0

 Attachments: HBASE-5256.trunk.v1.patch


 Currently in.readInt() is used in RegionLoad.readFields().
 More metrics will be added to RegionLoad in the future, so we should use 
 WritableUtils.readVInt() to reduce the amount of data exchanged between 
 Master and region servers.
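To illustrate why a variable-length integer reduces the bytes exchanged, here is a small self-contained sketch. It compares the fixed four bytes that `DataOutputStream.writeInt` always emits with a simple base-128 variable-length encoding. Note the assumption: the varint scheme below is a generic illustration and is not the exact byte format that Hadoop's `WritableUtils` uses. The class name is hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical demo class comparing fixed-width and variable-length int sizes.
final class VarIntSizeDemo {

    // Number of bytes produced by the fixed-width encoding (always 4).
    static int fixedSize(int value) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeInt(value);
            out.flush();
            return buf.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Number of bytes produced by an illustrative base-128 varint:
    // 7 payload bits per byte, high bit set when more bytes follow.
    // This is NOT Hadoop's exact on-wire format.
    static int varSize(int value) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        int v = value;
        while ((v & ~0x7F) != 0) {
            buf.write((v & 0x7F) | 0x80);
            v >>>= 7;
        }
        buf.write(v);
        return buf.size();
    }

    public static void main(String[] args) {
        int[] samples = {42, 300, 1 << 20};
        for (int s : samples) {
            System.out.println(s + ": fixed=" + fixedSize(s) + " bytes, variable=" + varSize(s) + " bytes");
        }
    }
}
```

The saving comes at the cost of a different wire format, which is why a reader that still expects fixed-width integers cannot parse data written this way.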


[jira] [Commented] (HBASE-5256) Use WritableUtils.readVInt() in RegionLoad.readFields()

2012-04-14 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13254263#comment-13254263
 ] 

Lars Hofhansl commented on HBASE-5256:
--

Or does it now? I think that affects the public interface.

 Use WritableUtils.readVInt() in RegionLoad.readFields()
 ---

 Key: HBASE-5256
 URL: https://issues.apache.org/jira/browse/HBASE-5256
 Project: HBase
  Issue Type: Task
Reporter: Zhihong Yu
Assignee: Mubarak Seyed
 Fix For: 0.94.0

 Attachments: HBASE-5256.trunk.v1.patch


 Currently in.readInt() is used in RegionLoad.readFields().
 More metrics will be added to RegionLoad in the future, so we should use 
 WritableUtils.readVInt() to reduce the amount of data exchanged between 
 Master and region servers.


[jira] [Commented] (HBASE-5677) The master never does balance because duplicate openhandled the one region

2012-04-13 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253549#comment-13253549
 ] 

Lars Hofhansl commented on HBASE-5677:
--

Arghh... OK. So:
* in 0.94+ this is fixed, correct?
* you'd like to backport HBASE-5454 to 0.90 and 0.92, right?

So let's close this one then?

 The master never does balance because duplicate openhandled the one region
 --

 Key: HBASE-5677
 URL: https://issues.apache.org/jira/browse/HBASE-5677
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.6
 Environment: 0.90
Reporter: xufeng
Assignee: xufeng
 Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0

 Attachments: 5677-proposal.txt, 5677-proposal.txt, 5677-proposal.txt, 
 HBASE-5677-90-v1.patch, surefire-report_no_patched_v1.html, 
 surefire-report_patched_v1.html


 If a region is assigned while the master is initializing (before it runs 
 processFailover), the region's open event is handled twice, 
 because the unassigned node in ZooKeeper is handled again in 
 AssignmentManager#processFailover(). 
 This leaves the region in RIT, so the master never runs the balancer.


[jira] [Commented] (HBASE-5778) Turn on WAL compression by default

2012-04-13 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253588#comment-13253588
 ] 

Lars Hofhansl commented on HBASE-5778:
--

I still don't understand why this is a problem with replication. J-D do you 
have any insights?

 Turn on WAL compression by default
 --

 Key: HBASE-5778
 URL: https://issues.apache.org/jira/browse/HBASE-5778
 Project: HBase
  Issue Type: Improvement
Reporter: Jean-Daniel Cryans
Assignee: Lars Hofhansl
Priority: Blocker
 Fix For: 0.96.0, 0.94.1

 Attachments: 5778-addendum.txt, 5778.addendum, HBASE-5778.patch


 I ran some tests to verify if WAL compression should be turned on by default.
 For a use case where it's not very useful (values two orders of magnitude 
 bigger than the keys), the insert time wasn't different and the CPU usage 15% 
 higher (150% CPU usage VS 130% when not compressing the WAL).
 When values are smaller than the keys, I saw a 38% improvement for the insert 
 run time and CPU usage was 33% higher (600% CPU usage VS 450%). I'm not sure 
 WAL compression accounts for all the additional CPU usage, it might just be 
 that we're able to insert faster and we spend more time in the MemStore per 
 second (because our MemStores are bad when they contain tens of thousands of 
 values).
 Those are two extremes, but it shows that for the price of some CPU we can 
 save a lot. My machines have 2 quads with HT, so I still had a lot of idle 
 CPUs.


[jira] [Commented] (HBASE-5778) Turn on WAL compression by default

2012-04-13 Thread Lars Hofhansl (Commented) (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253614#comment-13253614
]

Lars Hofhansl commented on HBASE-5778:
--

Oh I see. The KVs are only decompressed when read.

Turn on WAL compression by default
--

Attachments: 5778-addendum.txt, 5778.addendum, HBASE-5778.patch

[jira] [Commented] (HBASE-5604) M/R tool to replay WAL files

2012-04-13 Thread Lars Hofhansl (Commented) (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253765#comment-13253765
]

Lars Hofhansl commented on HBASE-5604:
--

Sigh... That's exactly what I was afraid of when adding date parsing.
What timezone is used to store the WAL edit? Is it GMT? If so, the date should
be interpreted as GMT.
If the data in the WAL is always the local TZ, I should remove the date
verification test (which in a sense is pointless anyway, because it just tests
whether SimpleDateFormat works).
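The time zone concern can be reproduced in isolation. The sketch below parses the same wall-clock string once as GMT and once in another zone and shows that the resulting epoch values differ by the zone offset. The date pattern and the class name are assumptions chosen for illustration; they are not necessarily what WALPlayer uses.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.TimeZone;

// Hypothetical demo: the same date string maps to different instants
// depending on the time zone configured on the formatter.
final class TimeZoneParseDemo {

    static long parseMillis(String text, String zoneId) {
        // Pattern is an assumption for this sketch.
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss");
        fmt.setTimeZone(TimeZone.getTimeZone(zoneId));
        try {
            return fmt.parse(text).getTime();
        } catch (ParseException e) {
            throw new IllegalArgumentException("unparseable date: " + text, e);
        }
    }

    public static void main(String[] args) {
        String text = "2012-04-13T12:00:00";
        long gmt = parseMillis(text, "GMT");
        long pacific = parseMillis(text, "America/Los_Angeles");
        System.out.println("GMT:     " + gmt);
        System.out.println("Pacific: " + pacific);
        System.out.println("Difference in hours: " + (pacific - gmt) / 3600000L);
    }
}
```

A test that compares a parsed value against a hard-coded epoch number therefore only passes on machines whose default zone matches the one the number was computed in, unless the formatter's zone is set explicitly.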

M/R tool to replay WAL files

Key: HBASE-5604
URL: https://issues.apache.org/jira/browse/HBASE-5604
Project: HBase
Issue Type: New Feature
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Fix For: 0.94.0, 0.96.0

Attachments: 5604-v10.txt, 5604-v11.txt, 5604-v4.txt, 5604-v6.txt,
5604-v7.txt, 5604-v8.txt, 5604-v9.txt, HLog-5604-v3.txt

Just an idea I had. Might be useful for restore of a backup using the HLogs.
This could be an M/R job (with a mapper per HLog file).
The tool would get a time range and a (set of) table(s). We'd pick the right
HLogs based on time before the M/R job is started and then have a mapper per
HLog file.
The mapper would then go through the HLog, filter out all WALEdits that don't
fit into the time range or don't belong to any of the tables, and then use
HFileOutputFormat to generate HFiles.
Would need to indicate the splits we want, probably from a live table.
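The per-mapper filtering step described above can be sketched as a plain function. Everything here is a simplified stand-in: the `Edit` class is a hypothetical placeholder for a WAL entry (table name plus timestamp), not the real HLogKey/WALEdit types, and the half-open time range is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the filter a mapper would apply to each WAL entry.
final class WalEditFilterSketch {

    // Minimal stand-in for a WAL entry; not an HBase class.
    static final class Edit {
        final String table;
        final long timestamp;

        Edit(String table, long timestamp) {
            this.table = table;
            this.timestamp = timestamp;
        }
    }

    // Keeps only edits for the requested tables whose timestamp lies in
    // [startInclusive, endExclusive).
    static List<Edit> select(List<Edit> edits, Set<String> tables,
                             long startInclusive, long endExclusive) {
        List<Edit> kept = new ArrayList<>();
        for (Edit e : edits) {
            boolean inRange = e.timestamp >= startInclusive && e.timestamp < endExclusive;
            if (inRange && tables.contains(e.table)) {
                kept.add(e);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Edit> log = Arrays.asList(
                new Edit("t1", 100L), new Edit("t2", 150L),
                new Edit("t1", 250L), new Edit("t1", 199L));
        Set<String> tables = new HashSet<>(Arrays.asList("t1"));
        System.out.println(select(log, tables, 100L, 200L).size() + " edits kept");
    }
}
```

In the proposed tool, the surviving edits would then be handed to the output format that writes HFiles; that part is omitted here.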

[jira] [Commented] (HBASE-5604) M/R tool to replay WAL files

2012-04-13 Thread Lars Hofhansl (Commented) (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253824#comment-13253824
]

Lars Hofhansl commented on HBASE-5604:
--

@Stack and Ted: I'll just remove that test. The HFile pretty printer will show
time in the same TZ as WALPlayer would parse it, which is still useful.
I'll not file an extra jira but post an addendum to this one.

M/R tool to replay WAL files

Key: HBASE-5604
URL: https://issues.apache.org/jira/browse/HBASE-5604
Project: HBase
Issue Type: New Feature
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Fix For: 0.94.0, 0.96.0

Attachments: 5604-v10.txt, 5604-v11.txt, 5604-v4.txt, 5604-v6.txt,
5604-v7.txt, 5604-v8.txt, 5604-v9.txt, HLog-5604-v3.txt

[jira] [Commented] (HBASE-5604) M/R tool to replay WAL files

2012-04-13 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253839#comment-13253839
 ] 

Lars Hofhansl commented on HBASE-5604:
--

OK. Done. I'll use that build as RC1.

 M/R tool to replay WAL files
 

 Key: HBASE-5604
 URL: https://issues.apache.org/jira/browse/HBASE-5604
 Project: HBase
  Issue Type: New Feature
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.94.0, 0.96.0

 Attachments: 5604-v10.txt, 5604-v11.txt, 5604-v4.txt, 5604-v6.txt, 
 5604-v7.txt, 5604-v8.txt, 5604-v9.txt, HLog-5604-v3.txt


 Just an idea I had. Might be useful for restore of a backup using the HLogs.
 This could be an M/R job (with a mapper per HLog file).
 The tool would get a time range and a (set of) table(s). We'd pick the right 
 HLogs based on time before the M/R job is started and then have a mapper per 
 HLog file.
 The mapper would then go through the HLog, filter out all WALEdits that don't 
 fit into the time range or don't belong to any of the tables, and then use 
 HFileOutputFormat to generate HFiles.
 Would need to indicate the splits we want, probably from a live table.


[jira] [Commented] (HBASE-5782) Not all the regions are getting assigned after the log splitting.

2012-04-13 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253886#comment-13253886
 ] 

Lars Hofhansl commented on HBASE-5782:
--

Did you have HBASE-5778 enabled?

 Not all the regions are getting assigned after the log splitting.
 -

 Key: HBASE-5782
 URL: https://issues.apache.org/jira/browse/HBASE-5782
 Project: HBase
  Issue Type: Bug
  Components: wal
Affects Versions: 0.94.0
Reporter: Gopinathan A
Priority: Critical

 Create a table with 1000 splits; after the region assignment, kill the 
 regionserver which hosts the META table.
 A few regions are missing after the log splitting and region assignment. 
 The HBCK report shows that multiple region holes were created.
 The same scenario was verified multiple times in 0.92.1 with no issues.


[jira] [Commented] (HBASE-5790) ZKUtil deleteRecursively should be a recoverable operation

2012-04-13 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253986#comment-13253986
 ] 

Lars Hofhansl commented on HBASE-5790:
--

Patch looks good.
Will there be any request size problem if the subtree to delete is large?
Super minor nit: Should point out in Javadoc that deleteRecursively is an 
idempotent operation.


 ZKUtil deleteRecursively should be a recoverable operation
 ---

 Key: HBASE-5790
 URL: https://issues.apache.org/jira/browse/HBASE-5790
 Project: HBase
  Issue Type: Improvement
Reporter: Jesse Yates
Assignee: Jesse Yates
  Labels: zookeeper
 Fix For: 0.96.0, 0.94.1

 Attachments: java_HBASE-5790.patch


 As of 3.4.3, ZooKeeper has full multi-operation transactions. This means 
 we can wholesale delete chunks of the zk tree and ensure that we don't have 
 any pesky recursive delete issues where we delete the children of a node, but 
 then a child joins before deletion of the parent. Even without transactions, 
 this should be the behavior, but it is possible to make it much cleaner now 
 that we have this new feature in zk.


[jira] [Commented] (HBASE-5782) Not all the regions are getting assigned after the log splitting.

2012-04-13 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253990#comment-13253990
 ] 

Lars Hofhansl commented on HBASE-5782:
--

You tried multiple times in 0.92.1, and no problem?
Any chance to capture this in a test? (Although that might be difficult)

 Not all the regions are getting assigned after the log splitting.
 -

 Key: HBASE-5782
 URL: https://issues.apache.org/jira/browse/HBASE-5782
 Project: HBase
  Issue Type: Bug
  Components: wal
Affects Versions: 0.94.0
Reporter: Gopinathan A
Priority: Critical

 Create a table with 1000 splits; after the region assignment, kill the 
 regionserver which hosts the META table.
 A few regions are missing after the log splitting and region assignment. 
 The HBCK report shows that multiple region holes were created.
 The same scenario was verified multiple times in 0.92.1 with no issues.


[jira] [Commented] (HBASE-5778) Turn on WAL compression by default

2012-04-13 Thread Lars Hofhansl (Commented) (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253995#comment-13253995
]

Lars Hofhansl commented on HBASE-5778:
--

This fundamentally breaks replication.

The problem above is actually that the HLogKey and WALEdit after being read
from a compressed HLog have the compression context set, and hence this will be
used to compress them when sent over the wire to the sink. Of course the sink
does not know how to uncompress.

So I just set the compression context to null in ReplicationSource.

With that hurdle out of the way, I find that seeking to a specific position in
the HLog (the position stored in ZK) does not work, because the dictionary is
not built up (compressed HLogs always need to be read from the beginning).

Not sure how to fix the 2nd part.
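The second problem (a dictionary-compressed log cannot be read from an arbitrary position) can be shown with a toy model. The encoding below is invented for this sketch and is not the real HLog compression format: the first occurrence of a value is written literally and added to a dictionary, and later occurrences are written as a reference to the dictionary index. A reader that starts in the middle has an empty dictionary and cannot resolve references. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy dictionary compression: "L:<value>" is a literal that also registers the
// value in the dictionary; "R:<n>" refers to the n-th registered value.
final class DictionaryLogDemo {

    static List<String> encode(List<String> values) {
        Map<String, Integer> index = new HashMap<>();
        List<String> out = new ArrayList<>();
        for (String v : values) {
            Integer id = index.get(v);
            if (id == null) {
                index.put(v, index.size());
                out.add("L:" + v);
            } else {
                out.add("R:" + id);
            }
        }
        return out;
    }

    // Decodes entries starting at 'start'; the dictionary is rebuilt only from
    // the entries actually read, just as a reader seeking to an offset would.
    static List<String> decode(List<String> entries, int start) {
        List<String> dictionary = new ArrayList<>();
        List<String> out = new ArrayList<>();
        for (int i = start; i < entries.size(); i++) {
            String entry = entries.get(i);
            if (entry.startsWith("L:")) {
                String value = entry.substring(2);
                dictionary.add(value);
                out.add(value);
            } else {
                int id = Integer.parseInt(entry.substring(2));
                if (id >= dictionary.size()) {
                    throw new IllegalStateException("unknown dictionary entry " + id + " at position " + i);
                }
                out.add(dictionary.get(id));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> encoded = encode(Arrays.asList("cf1", "cf1", "cf2", "cf1"));
        System.out.println("from start: " + decode(encoded, 0));
        try {
            decode(encoded, 1);
        } catch (IllegalStateException e) {
            System.out.println("from position 1: " + e.getMessage());
        }
    }
}
```

Even when a mid-stream read happens to start on a literal, the indexes it assigns no longer match the writer's, so later references can resolve to the wrong value rather than fail outright.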

Turn on WAL compression by default
--

Attachments: 5778-addendum.txt, 5778.addendum, HBASE-5778.patch

[jira] [Commented] (HBASE-5782) Not all the regions are getting assigned after the log splitting.

2012-04-13 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253996#comment-13253996
 ] 

Lars Hofhansl commented on HBASE-5782:
--

And is there anything interesting in the log files?

 Not all the regions are getting assigned after the log splitting.
 -

 Key: HBASE-5782
 URL: https://issues.apache.org/jira/browse/HBASE-5782
 Project: HBase
  Issue Type: Bug
  Components: wal
Affects Versions: 0.94.0
Reporter: Gopinathan A
Priority: Critical

 Create a table with 1000 splits; after the region assignment, kill the 
 regionserver which hosts the META table.
 A few regions are missing after the log splitting and region assignment. 
 The HBCK report shows that multiple region holes were created.
 The same scenario was verified multiple times in 0.92.1 with no issues.


[jira] [Commented] (HBASE-5604) M/R tool to replay WAL files

2012-04-12 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13252530#comment-13252530
 ] 

Lars Hofhansl commented on HBASE-5604:
--

Since this is only new code I propose this for 0.94.0 as well.

 M/R tool to replay WAL files
 

 Key: HBASE-5604
 URL: https://issues.apache.org/jira/browse/HBASE-5604
 Project: HBase
  Issue Type: New Feature
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Attachments: 5604-v4.txt, 5604-v6.txt, 5604-v7.txt, 5604-v8.txt, 
 5604-v9.txt, HLog-5604-v3.txt


 Just an idea I had. Might be useful for restore of a backup using the HLogs.
 This could be an M/R job (with a mapper per HLog file).
 The tool would get a time range and a (set of) table(s). We'd pick the right 
 HLogs based on time before the M/R job is started and then have a mapper per 
 HLog file.
 The mapper would then go through the HLog, filter out all WALEdits that don't 
 fit into the time range or don't belong to any of the tables, and then use 
 HFileOutputFormat to generate HFiles.
 Would need to indicate the splits we want, probably from a live table.


[jira] [Commented] (HBASE-5677) The master never does balance because duplicate openhandled the one region

2012-04-12 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13252533#comment-13252533
 ] 

Lars Hofhansl commented on HBASE-5677:
--

@Stack: This seems like a good addition to 0.94.
Let's say that if we can track down the test failures today, we'll put this in 
0.94.0; otherwise, 0.94.1.

 The master never does balance because duplicate openhandled the one region
 --

 Key: HBASE-5677
 URL: https://issues.apache.org/jira/browse/HBASE-5677
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.6
 Environment: 0.90
Reporter: xufeng
Assignee: xufeng
 Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0

 Attachments: 5677-proposal.txt, HBASE-5677-90-v1.patch, 
 surefire-report_no_patched_v1.html, surefire-report_patched_v1.html


 If a region is assigned while the master is initializing (before it runs 
 processFailover), the region's open event is handled twice, 
 because the unassigned node in ZooKeeper is handled again in 
 AssignmentManager#processFailover(). 
 This leaves the region in RIT, so the master never runs the balancer.


[jira] [Commented] (HBASE-3443) ICV optimization to look in memstore first and then store files (HBASE-3082) does not work when deletes are in the mix

2012-04-12 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-3443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13252726#comment-13252726
 ] 

Lars Hofhansl commented on HBASE-3443:
--

@Stack: Wasn't sure about this one. Didn't feel right to add this to a 
performance release :)

You think this should go into 0.94? Might be good to have this correctness fix 
early.

 ICV optimization to look in memstore first and then store files (HBASE-3082) 
 does not work when deletes are in the mix
 --

 Key: HBASE-3443
 URL: https://issues.apache.org/jira/browse/HBASE-3443
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.90.0, 0.90.1, 0.90.2, 0.90.3, 0.90.4, 0.90.5, 0.90.6, 
 0.92.0, 0.92.1
Reporter: Kannan Muthukkaruppan
Assignee: Lars Hofhansl
Priority: Critical
  Labels: corruption
 Fix For: 0.96.0

 Attachments: 3443.txt


 For incrementColumnValue() HBASE-3082 adds an optimization to check memstores 
 first, and only if not present in the memstore then check the store files. In 
 the presence of deletes, the above optimization is not reliable.
 If the column is marked as deleted in the memstore, one should not look 
 further into the store files. But currently, the code does so.
 Sample test code outline:
 {code}
 # JRuby outline; desc, conf, tableName and cf1name are defined elsewhere
 admin.createTable(desc)
 table = HTable.new(conf, tableName)
 table.incrementColumnValue(Bytes.toBytes("row"), cf1name, 
 Bytes.toBytes("column"), 5);
 # flush so the first increment ends up in a store file
 admin.flush(tableName)
 sleep(2)
 # the delete marker now lives only in the memstore
 del = Delete.new(Bytes.toBytes("row"))
 table.delete(del)
 table.incrementColumnValue(Bytes.toBytes("row"), cf1name, 
 Bytes.toBytes("column"), 5);
 get = Get.new(Bytes.toBytes("row"))
 keyValues = table.get(get).raw()
 keyValues.each do |keyValue|
   puts "Expect 5; Got Value=#{Bytes.toLong(keyValue.getValue())}";
 end
 {code}
 The above prints:
 {code}
 Expect 5; Got Value=10
 {code}
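
 To make the failure mode above concrete, here is a toy model of the lookup, 
 not the actual HRegion code. The class IcvLookupSketch, its Cell type and the 
 methods buggyBase() and fixedBase() are all hypothetical names, and the model 
 ignores timestamps and versions entirely. It only shows why a delete marker 
 in the memstore has to stop the search before the store files are consulted.

```java
public class IcvLookupSketch {
    /** What one storage layer knows about a column: a live value, a delete marker, or nothing. */
    enum Kind { VALUE, DELETED, ABSENT }

    /** Hypothetical, simplified stand-in for the newest entry of a column in one layer. */
    static final class Cell {
        final Kind kind;
        final long value;

        private Cell(Kind kind, long value) {
            this.kind = kind;
            this.value = value;
        }

        static Cell value(long v) { return new Cell(Kind.VALUE, v); }
        static Cell deleted() { return new Cell(Kind.DELETED, 0L); }
        static Cell absent() { return new Cell(Kind.ABSENT, 0L); }
    }

    /** Models the reported behavior: anything but a live memstore value falls through to the store file. */
    static long buggyBase(Cell memstore, Cell storeFile) {
        if (memstore.kind == Kind.VALUE) {
            return memstore.value;
        }
        return storeFile.kind == Kind.VALUE ? storeFile.value : 0L;
    }

    /** Models the expected behavior: a delete marker in the memstore ends the search. */
    static long fixedBase(Cell memstore, Cell storeFile) {
        if (memstore.kind == Kind.VALUE) {
            return memstore.value;
        }
        if (memstore.kind == Kind.DELETED) {
            return 0L; // the column was deleted, so the increment starts from zero
        }
        return storeFile.kind == Kind.VALUE ? storeFile.value : 0L;
    }

    public static void main(String[] args) {
        Cell flushed = Cell.value(5);    // first increment, already flushed to a store file
        Cell memstore = Cell.deleted();  // delete marker still in the memstore
        System.out.println("buggy: " + (buggyBase(memstore, flushed) + 5));
        System.out.println("fixed: " + (fixedBase(memstore, flushed) + 5));
    }
}
```

 In this model the buggy path yields 10 and the fixed path yields 5, matching 
 the "Expect 5; Got Value=10" output reported above.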


[jira] [Commented] (HBASE-5774) Add documentation for WALPlayer to HBase reference guide.

2012-04-12 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13252755#comment-13252755
 ] 

Lars Hofhansl commented on HBASE-5774:
--

Thanks Doug. I think I'll just roll this up into parent.

 Add documentation for WALPlayer to HBase reference guide.
 -

 Key: HBASE-5774
 URL: https://issues.apache.org/jira/browse/HBASE-5774
 Project: HBase
  Issue Type: Sub-task
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Attachments: 5774.txt





[jira] [Commented] (HBASE-5775) ZKUtil doesn't handle deleteRecurisively cleanly

2012-04-12 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13252831#comment-13252831
 ] 

Lars Hofhansl commented on HBASE-5775:
--

Will commit when HadoopQA is done.

 ZKUtil doesn't handle deleteRecurisively cleanly
 

 Key: HBASE-5775
 URL: https://issues.apache.org/jira/browse/HBASE-5775
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.94.0
Reporter: Jesse Yates
Assignee: Jesse Yates
 Fix For: 0.94.0, 0.96.0

 Attachments: java_HBASE-5775.patch


 ZKUtil.deleteNodeRecursively()'s contract says that it handles deletion of 
 the node and all its children. However, nothing is mentioned as to what 
 happens if the node you are attempting to delete doesn't actually exist. 
 Turns out, it throws a null pointer exception. I'm proposing that we change 
 the code so that it handles the case where the parent is already gone and 
 exits cleanly, rather than failing horribly.
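
 The proposal above can be sketched against a toy in-memory namespace. This is 
 not the attached patch and does not use the real ZKUtil or ZooKeeper client. 
 The class RecursiveDeleteSketch and its methods are hypothetical, and the 
 convention that listChildren() returns null for a missing node is an 
 assumption made for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class RecursiveDeleteSketch {
    // Toy stand-in for a znode tree: the set of absolute paths that currently exist.
    private final Set<String> nodes = new TreeSet<>();

    void create(String path) { nodes.add(path); }

    boolean exists(String path) { return nodes.contains(path); }

    /** Returns the direct children of path, or null if path does not exist (assumed convention). */
    List<String> listChildren(String path) {
        if (!nodes.contains(path)) {
            return null;
        }
        List<String> children = new ArrayList<>();
        String prefix = path + "/";
        for (String node : nodes) {
            // a direct child starts with the prefix and has no further '/' after it
            if (node.startsWith(prefix) && node.indexOf('/', prefix.length()) < 0) {
                children.add(node);
            }
        }
        return children;
    }

    /** Deletes path and everything below it; a missing node is treated as already deleted. */
    void deleteNodeRecursively(String path) {
        List<String> children = listChildren(path);
        if (children == null) {
            return; // node is already gone: exit cleanly instead of dereferencing null
        }
        for (String child : children) {
            deleteNodeRecursively(child);
        }
        nodes.remove(path);
    }

    public static void main(String[] args) {
        RecursiveDeleteSketch zk = new RecursiveDeleteSketch();
        zk.create("/hbase");
        zk.create("/hbase/a");
        zk.create("/hbase/a/b");
        zk.deleteNodeRecursively("/hbase/a");
        zk.deleteNodeRecursively("/hbase/a"); // second call is a no-op rather than an NPE
        System.out.println("parent still exists: " + zk.exists("/hbase"));
    }
}
```

 Without the null check, the for loop over children would be the place where 
 a missing node turns into a NullPointerException.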


[jira] [Commented] (HBASE-5775) ZKUtil doesn't handle deleteRecurisively cleanly

2012-04-12 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13252889#comment-13252889
 ] 

Lars Hofhansl commented on HBASE-5775:
--

Thanks for the patch.

 ZKUtil doesn't handle deleteRecurisively cleanly
 

 Key: HBASE-5775
 URL: https://issues.apache.org/jira/browse/HBASE-5775
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.94.0
Reporter: Jesse Yates
Assignee: Jesse Yates
 Fix For: 0.94.0, 0.96.0

 Attachments: java_HBASE-5775.patch




[jira] [Commented] (HBASE-5604) M/R tool to replay WAL files

2012-04-12 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13252892#comment-13252892
 ] 

Lars Hofhansl commented on HBASE-5604:
--

@Ted: Are you OK with latest patch? If so, I'll commit today.

 M/R tool to replay WAL files
 

 Key: HBASE-5604
 URL: https://issues.apache.org/jira/browse/HBASE-5604
 Project: HBase
  Issue Type: New Feature
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.94.0, 0.96.0

 Attachments: 5604-v10.txt, 5604-v11.txt, 5604-v4.txt, 5604-v6.txt, 
 5604-v7.txt, 5604-v8.txt, 5604-v9.txt, HLog-5604-v3.txt


 Just an idea I had. Might be useful for restoring a backup using the HLogs.
 This could be an M/R job (with a mapper per HLog file).
 The tool would get a time range and a (set of) table(s). We'd pick the right 
 HLogs based on time before the M/R job is started and then have a mapper per 
 HLog file.
 The mapper would then go through the HLog, filter out all WALEdits that don't 
 fit into the time range or don't belong to any of the tables, and then use 
 HFileOutputFormat to generate HFiles.
 We would need to indicate the splits we want, probably from a live table.
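
 The filtering step of the mapper described above might look roughly like the 
 following. This is a plain-Java sketch under stated assumptions and not the 
 committed tool: the Edit class is a hypothetical stand-in for a WAL entry, 
 and treating the time range as half-open (start inclusive, end exclusive) is 
 an assumption for the illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class WalFilterSketch {
    /** Hypothetical stand-in for one WAL entry: the table it belongs to and its write time. */
    static final class Edit {
        final String table;
        final long timestamp;

        Edit(String table, long timestamp) {
            this.table = table;
            this.timestamp = timestamp;
        }
    }

    /** Keeps only edits for the requested tables whose timestamp lies in [startTime, endTime). */
    static List<Edit> filter(List<Edit> edits, Set<String> tables, long startTime, long endTime) {
        List<Edit> kept = new ArrayList<>();
        for (Edit edit : edits) {
            boolean inRange = edit.timestamp >= startTime && edit.timestamp < endTime;
            if (inRange && tables.contains(edit.table)) {
                kept.add(edit);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Edit> log = Arrays.asList(
            new Edit("t1", 100L),   // kept
            new Edit("t2", 150L),   // dropped: table not requested
            new Edit("t1", 300L));  // dropped: outside the time range
        Set<String> tables = new HashSet<>(Arrays.asList("t1"));
        System.out.println("kept edits: " + filter(log, tables, 50L, 200L).size());
    }
}
```

 In the real job the surviving edits would then be handed to the output 
 format. That part is omitted here.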


[jira] [Commented] (HBASE-5604) M/R tool to replay WAL files

2012-04-12 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13252934#comment-13252934
 ] 

Lars Hofhansl commented on HBASE-5604:
--

And I did insert the spaces before commit :)

 M/R tool to replay WAL files
 

 Key: HBASE-5604
 URL: https://issues.apache.org/jira/browse/HBASE-5604
 Project: HBase
  Issue Type: New Feature
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.94.0, 0.96.0

 Attachments: 5604-v10.txt, 5604-v11.txt, 5604-v4.txt, 5604-v6.txt, 
 5604-v7.txt, 5604-v8.txt, 5604-v9.txt, HLog-5604-v3.txt




[jira] [Commented] (HBASE-5778) Turn on WAL compression by default

2012-04-12 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13252939#comment-13252939
 ] 

Lars Hofhansl commented on HBASE-5778:
--

+1 on patch

 Turn on WAL compression by default
 --

 Key: HBASE-5778
 URL: https://issues.apache.org/jira/browse/HBASE-5778
 Project: HBase
  Issue Type: Improvement
Reporter: Jean-Daniel Cryans
Priority: Blocker
 Fix For: 0.94.0, 0.96.0

 Attachments: HBASE-5778.patch


 I ran some tests to verify whether WAL compression should be turned on by 
 default.
 For a use case where it's not very useful (values two orders of magnitude 
 bigger than the keys), the insert time wasn't different and the CPU usage was 
 15% higher (150% CPU usage vs. 130% when not compressing the WAL).
 When values are smaller than the keys, I saw a 38% improvement in the insert 
 run time and CPU usage was 33% higher (600% CPU usage vs. 450%). I'm not sure 
 WAL compression accounts for all the additional CPU usage; it might just be 
 that we're able to insert faster and we spend more time in the MemStore per 
 second (because our MemStores are bad when they contain tens of thousands of 
 values).
 Those are two extremes, but they show that for the price of some CPU we can 
 save a lot. My machines have 2 quads with HT, so I still had a lot of idle 
 CPUs.


[jira] [Commented] (HBASE-5677) The master never does balance because duplicate openhandled the one region

2012-04-12 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13252950#comment-13252950
 ] 

Lars Hofhansl commented on HBASE-5677:
--

I ran TestReplication with this patch (5677-proposal.txt) and it passed fine.

 The master never does balance because duplicate openhandled the one region
 --

 Key: HBASE-5677
 URL: https://issues.apache.org/jira/browse/HBASE-5677
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.6
 Environment: 0.90
Reporter: xufeng
Assignee: xufeng
 Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0

 Attachments: 5677-proposal.txt, HBASE-5677-90-v1.patch, 
 surefire-report_no_patched_v1.html, surefire-report_patched_v1.html


 If a region is assigned while the master is still initializing (before 
 processFailover runs), the opening of that region will be handled twice, 
 because the unassigned node in ZooKeeper is handled again in 
 AssignmentManager#processFailover().
 This leaves the region in RIT, so the master never balances.


[jira] [Commented] (HBASE-5677) The master never does balance because duplicate openhandled the one region

2012-04-12 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13252953#comment-13252953
 ] 

Lars Hofhansl commented on HBASE-5677:
--

TestImportTsv passes as well.

 The master never does balance because duplicate openhandled the one region
 --

 Key: HBASE-5677
 URL: https://issues.apache.org/jira/browse/HBASE-5677
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.6
 Environment: 0.90
Reporter: xufeng
Assignee: xufeng
 Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0

 Attachments: 5677-proposal.txt, 5677-proposal.txt, 
 HBASE-5677-90-v1.patch, surefire-report_no_patched_v1.html, 
 surefire-report_patched_v1.html




[jira] [Commented] (HBASE-5677) The master never does balance because duplicate openhandled the one region

2012-04-12 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253046#comment-13253046
 ] 

Lars Hofhansl commented on HBASE-5677:
--

@xufeng: I made a trunk patch, so that I can get a HadoopQA test run.

All the failures are unrelated, though. In fact they are all because of 
negative array sizes in WALEdit de-serialization, which is suspicious.

 The master never does balance because duplicate openhandled the one region
 --

 Key: HBASE-5677
 URL: https://issues.apache.org/jira/browse/HBASE-5677
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.6
 Environment: 0.90
Reporter: xufeng
Assignee: xufeng
 Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0

 Attachments: 5677-proposal.txt, 5677-proposal.txt, 5677-proposal.txt, 
 HBASE-5677-90-v1.patch, surefire-report_no_patched_v1.html, 
 surefire-report_patched_v1.html




[jira] [Commented] (HBASE-5778) Turn on WAL compression by default

2012-04-12 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253053#comment-13253053
 ] 

Lars Hofhansl commented on HBASE-5778:
--

I see a bunch of suspicious test failures now:
{code}
java.lang.NegativeArraySizeException
    at org.apache.hadoop.hbase.regionserver.wal.HLogKey.readFields(HLogKey.java:305)
{code}


 Turn on WAL compression by default
 --

 Key: HBASE-5778
 URL: https://issues.apache.org/jira/browse/HBASE-5778
 Project: HBase
  Issue Type: Improvement
Reporter: Jean-Daniel Cryans
Assignee: Jean-Daniel Cryans
Priority: Blocker
 Fix For: 0.94.0, 0.96.0

 Attachments: HBASE-5778.patch




[jira] [Commented] (HBASE-5778) Turn on WAL compression by default

2012-04-12 Thread Lars Hofhansl (Commented) (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253058#comment-13253058
]

Lars Hofhansl commented on HBASE-5778:
--

Yeah... The failures in TestHLog are because of this. Need to roll back or
figure out what the problem is. Probably test related.

Turn on WAL compression by default
--

Key: HBASE-5778
URL: https://issues.apache.org/jira/browse/HBASE-5778
Project: HBase
Issue Type: Improvement
Reporter: Jean-Daniel Cryans
Assignee: Jean-Daniel Cryans
Priority: Blocker
Fix For: 0.94.0, 0.96.0

Attachments: HBASE-5778.patch

[jira] [Commented] (HBASE-5778) Turn on WAL compression by default

2012-04-12 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253061#comment-13253061
 ] 

Lars Hofhansl commented on HBASE-5778:
--

TestHLog.testAppendClose() uses this to read back the WALEdits:
{code}
// Make sure you can read all the content
SequenceFile.Reader reader
  = new SequenceFile.Reader(this.fs, walPath, this.conf);
{code}
Well, duh, that does not work.

 Turn on WAL compression by default
 --

 Key: HBASE-5778
 URL: https://issues.apache.org/jira/browse/HBASE-5778
 Project: HBase
  Issue Type: Improvement
Reporter: Jean-Daniel Cryans
Assignee: Jean-Daniel Cryans
Priority: Blocker
 Fix For: 0.94.0, 0.96.0

 Attachments: HBASE-5778.patch




[jira] [Commented] (HBASE-5778) Turn on WAL compression by default

2012-04-12 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253063#comment-13253063
 ] 

Lars Hofhansl commented on HBASE-5778:
--

Then there's FaultySequenceFileLogReader, which does not do the right thing.

 Turn on WAL compression by default
 --

 Key: HBASE-5778
 URL: https://issues.apache.org/jira/browse/HBASE-5778
 Project: HBase
  Issue Type: Improvement
Reporter: Jean-Daniel Cryans
Assignee: Jean-Daniel Cryans
Priority: Blocker
 Fix For: 0.94.0, 0.96.0

 Attachments: HBASE-5778.patch




[jira] [Commented] (HBASE-5778) Turn on WAL compression by default

2012-04-12 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253065#comment-13253065
 ] 

Lars Hofhansl commented on HBASE-5778:
--

Have a fix for TestHLog, working on TestHLogSplit

 Turn on WAL compression by default
 --

 Key: HBASE-5778
 URL: https://issues.apache.org/jira/browse/HBASE-5778
 Project: HBase
  Issue Type: Improvement
Reporter: Jean-Daniel Cryans
Assignee: Jean-Daniel Cryans
Priority: Blocker
 Fix For: 0.94.0, 0.96.0

 Attachments: HBASE-5778.patch




[jira] [Commented] (HBASE-5778) Turn on WAL compression by default

2012-04-12 Thread Lars Hofhansl (Commented) (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253091#comment-13253091
]

Lars Hofhansl commented on HBASE-5778:
--

Similar... There're some other issues. I'll have a patch soon.

Turn on WAL compression by default
--

Attachments: 5778.addendum, HBASE-5778.patch

[jira] [Commented] (HBASE-5604) M/R tool to replay WAL files

2012-04-12 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253094#comment-13253094
 ] 

Lars Hofhansl commented on HBASE-5604:
--

Yes, should do an addendum. Will mark as large.

 M/R tool to replay WAL files
 

 Key: HBASE-5604
 URL: https://issues.apache.org/jira/browse/HBASE-5604
 Project: HBase
  Issue Type: New Feature
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.94.0, 0.96.0

 Attachments: 5604-v10.txt, 5604-v11.txt, 5604-v4.txt, 5604-v6.txt, 
 5604-v7.txt, 5604-v8.txt, 5604-v9.txt, HLog-5604-v3.txt




[jira] [Commented] (HBASE-5604) M/R tool to replay WAL files

2012-04-12 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253095#comment-13253095
 ] 

Lars Hofhansl commented on HBASE-5604:
--

Done. Thanks Ted.

 M/R tool to replay WAL files
 

 Key: HBASE-5604
 URL: https://issues.apache.org/jira/browse/HBASE-5604
 Project: HBase
  Issue Type: New Feature
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.94.0, 0.96.0

 Attachments: 5604-v10.txt, 5604-v11.txt, 5604-v4.txt, 5604-v6.txt, 
 5604-v7.txt, 5604-v8.txt, 5604-v9.txt, HLog-5604-v3.txt




[jira] [Commented] (HBASE-5778) Turn on WAL compression by default

2012-04-12 Thread Lars Hofhansl (Commented) (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13253118#comment-13253118
]

Lars Hofhansl commented on HBASE-5778:
--

mvn failed with an OOME. Let's revert this change, until we track these issues
down.

Turn on WAL compression by default
--

Attachments: 5778-addendum.txt, 5778.addendum, HBASE-5778.patch

[jira] [Commented] (HBASE-5677) The master never does balance because duplicate openhandled the one region

2012-04-11 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13252057#comment-13252057
 ] 

Lars Hofhansl commented on HBASE-5677:
--

Sigh... I think these are test-related issues. In a real cluster the Master is 
useless unless initialized (although I wouldn't know if that holds for a backup 
master as well).
I'll have a look at these when I get a chance.

 The master never does balance because duplicate openhandled the one region
 --

 Key: HBASE-5677
 URL: https://issues.apache.org/jira/browse/HBASE-5677
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.6
 Environment: 0.90
Reporter: xufeng
Assignee: xufeng
 Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0

 Attachments: 5677-proposal.txt, HBASE-5677-90-v1.patch, 
 surefire-report_no_patched_v1.html, surefire-report_patched_v1.html




[jira] [Commented] (HBASE-5741) ImportTsv does not check for table existence

2012-04-11 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13252064#comment-13252064
 ] 

Lars Hofhansl commented on HBASE-5741:
--

Sorry for chiming in late here, but do we actually want ImportTsv to create the 
table for us?
Personally I'd rather have it fail, which forces me to think about how I want 
to create the table (pre-split, compression, etc.).


 ImportTsv does not check for table existence 
 -

 Key: HBASE-5741
 URL: https://issues.apache.org/jira/browse/HBASE-5741
 Project: HBase
  Issue Type: Bug
  Components: mapreduce
Affects Versions: 0.90.4
Reporter: Clint Heath
Assignee: Himanshu Vashishtha
 Fix For: 0.94.0, 0.96.0

 Attachments: 5741-94.txt, 5741-v3.txt, HBase-5741-v2.patch, 
 HBase-5741.patch


 The usage statement for the importtsv command to hbase claims this:
 "Note: if you do not use this option, then the target table must already 
 exist in HBase" (in reference to the importtsv.bulk.output command-line 
 option)
 The truth is, the table must exist no matter what; importtsv cannot and will 
 not create it for you.
 This is the case because the createSubmittableJob method of ImportTsv does 
 not even attempt to check if the table exists already, much less create it:
 (From org.apache.hadoop.hbase.mapreduce.ImportTsv.java)
 305 HTable table = new HTable(conf, tableName);
 The HTable method signature in use there assumes the table exists and runs a 
 meta scan on it:
 (From org.apache.hadoop.hbase.client.HTable.java)
 142 * Creates an object to access a HBase table.
 ...
 151 public HTable(Configuration conf, final String tableName)
 What we should do inside of createSubmittableJob is something similar to what 
 the completebulkloads command would do:
 (Taken from org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.java)
 690 boolean tableExists = this.doesTableExist(tableName);
 691 if (!tableExists) this.createTable(tableName,dirPath);
 Currently the docs are misleading: the table in fact must exist prior to 
 running importtsv. We should check whether it exists rather than assume it's 
 already there and throw the below exception:
 12/03/14 17:15:42 WARN client.HConnectionManager$HConnectionImplementation: 
 Encountered problems when prefetch META table: 
 org.apache.hadoop.hbase.TableNotFoundException: Cannot find row in .META. for 
 table: myTable2, row=myTable2,,99
   at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:150)
 ...
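
 One way to picture the existence check discussed in this issue is a 
 fail-fast guard that runs before the job is set up. The sketch below is 
 self-contained and does not use the HBase client API. The TableCatalog 
 interface and the requireTable() helper are hypothetical names, and whether 
 the tool should fail or create the table is exactly the open question in the 
 comments.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class TableCheckSketch {
    /** Hypothetical lookup; in a real tool this would be backed by the cluster's table metadata. */
    interface TableCatalog {
        boolean tableExists(String tableName);
    }

    /** Fails early with a readable message instead of surfacing a META scan error later. */
    static void requireTable(TableCatalog catalog, String tableName) {
        if (!catalog.tableExists(tableName)) {
            throw new IllegalStateException("Table '" + tableName
                + "' does not exist; create it (splits, compression, ...) before importing");
        }
    }

    public static void main(String[] args) {
        Set<String> existing = new HashSet<>(Arrays.asList("myTable"));
        TableCatalog catalog = existing::contains;

        requireTable(catalog, "myTable"); // passes silently
        try {
            requireTable(catalog, "myTable2");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

 Failing here keeps the decision about pre-splitting and compression with the 
 person who creates the table.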


[jira] [Commented] (HBASE-5741) ImportTsv does not check for table existence

2012-04-11 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13252076#comment-13252076
 ] 

Lars Hofhansl commented on HBASE-5741:
--

Import does not do that either. At least we should be consistent between the 
various importing tools.
It seems overkill to burden these tools with that.

That said, I am not opposed, just questioning the usefulness.

 ImportTsv does not check for table existence 
 -

 Key: HBASE-5741
 URL: https://issues.apache.org/jira/browse/HBASE-5741
 Project: HBase
  Issue Type: Bug
  Components: mapreduce
Affects Versions: 0.90.4
Reporter: Clint Heath
Assignee: Himanshu Vashishtha
 Fix For: 0.94.0, 0.96.0

 Attachments: 5741-94.txt, 5741-v3.txt, HBase-5741-v2.patch, 
 HBase-5741.patch


 The usage statement for the importtsv command to hbase claims this:
 Note: if you do not use this option, then the target table must already 
 exist in HBase (in reference to the importtsv.bulk.output command-line 
 option)
 The truth is, the table must exist no matter what, importtsv cannot and will 
 not create it for you.
 This is the case because the createSubmittableJob method of ImportTsv does 
 not even attempt to check if the table exists already, much less create it:
 (From org.apache.hadoop.hbase.mapreduce.ImportTsv.java)
 305 HTable table = new HTable(conf, tableName);
 The HTable method signature in use there assumes the table exists and runs a 
 meta scan on it:
 (From org.apache.hadoop.hbase.client.HTable.java)
 142 * Creates an object to access a HBase table.
 ...
 151 public HTable(Configuration conf, final String tableName)
 What we should do inside of createSubmittableJob is something similar to what 
 the completebulkloads command would do:
 (Taken from org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.java)
 690 boolean tableExists = this.doesTableExist(tableName);
 691 if (!tableExists) this.createTable(tableName,dirPath);
 Currently the docs are misleading, the table in fact must exist prior to 
 running importtsv. We should check if it exists rather than assume it's 
 already there and throw the below exception:
 12/03/14 17:15:42 WARN client.HConnectionManager$HConnectionImplementation: 
 Encountered problems when prefetch META table: 
 org.apache.hadoop.hbase.TableNotFoundException: Cannot find row in .META. for 
 table: myTable2, row=myTable2,,99
   at 
 org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:150)
 ...
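
The check proposed in the description could be sketched roughly as below. This is an illustration only: `TableCatalog`, `InMemoryCatalog` and `ensureTable` are hypothetical stand-ins so the example runs without a cluster; they are not HBase classes, and a real patch would presumably use the HBase admin API inside createSubmittableJob instead.

```java
import java.util.HashSet;
import java.util.Set;

public class TableExistenceCheck {
    /** Hypothetical stand-in for the admin calls a real job setup would make. */
    interface TableCatalog {
        boolean tableExists(String tableName);
        void createTable(String tableName);
    }

    /** In-memory catalog so the sketch runs without HBase. */
    static class InMemoryCatalog implements TableCatalog {
        private final Set<String> tables = new HashSet<>();
        public boolean tableExists(String tableName) { return tables.contains(tableName); }
        public void createTable(String tableName) { tables.add(tableName); }
    }

    /**
     * Verifies the table before anything tries to open it.
     * Returns true if the table had to be created, false if it was already there.
     * With createIfMissing == false a missing table fails fast with a clear message
     * instead of surfacing later as a TableNotFoundException from a meta scan.
     */
    static boolean ensureTable(TableCatalog catalog, String tableName, boolean createIfMissing) {
        if (catalog.tableExists(tableName)) {
            return false;
        }
        if (!createIfMissing) {
            throw new IllegalStateException(
                "Table '" + tableName + "' does not exist; create it before running the import");
        }
        catalog.createTable(tableName);
        return true;
    }

    public static void main(String[] args) {
        InMemoryCatalog catalog = new InMemoryCatalog();
        System.out.println(ensureTable(catalog, "myTable2", true));  // table is created
        System.out.println(ensureTable(catalog, "myTable2", false)); // already present
    }
}
```

Whether a missing table should be created or merely reported is a policy choice; the `createIfMissing` flag only makes that choice explicit.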


[jira] [Commented] (HBASE-5741) ImportTsv does not check for table existence

2012-04-11 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13252095#comment-13252095
 ] 

Lars Hofhansl commented on HBASE-5741:
--

Even the created HFileOutputFormat uses the table in HBase to find out about 
splits. We will rarely bulk import into an empty table that is not 
pre-split...? (I'm assuming here; you folks at Cloudera see many more use cases 
than I do.)

If you believe strongly that we should add this, let's do it :)
We can do Import in a separate patch, or not do that as we see fit.


 ImportTsv does not check for table existence 
 -

 Key: HBASE-5741
 URL: https://issues.apache.org/jira/browse/HBASE-5741
 Project: HBase
  Issue Type: Bug
  Components: mapreduce
Affects Versions: 0.90.4
Reporter: Clint Heath
Assignee: Himanshu Vashishtha
 Fix For: 0.94.0, 0.96.0

 Attachments: 5741-94.txt, 5741-v3.txt, HBase-5741-v2.patch, 
 HBase-5741.patch


 The usage statement for the importtsv command to hbase claims this:
 Note: if you do not use this option, then the target table must already 
 exist in HBase (in reference to the importtsv.bulk.output command-line 
 option)
 The truth is, the table must exist no matter what, importtsv cannot and will 
 not create it for you.
 This is the case because the createSubmittableJob method of ImportTsv does 
 not even attempt to check if the table exists already, much less create it:
 (From org.apache.hadoop.hbase.mapreduce.ImportTsv.java)
 305 HTable table = new HTable(conf, tableName);
 The HTable method signature in use there assumes the table exists and runs a 
 meta scan on it:
 (From org.apache.hadoop.hbase.client.HTable.java)
 142 * Creates an object to access a HBase table.
 ...
 151 public HTable(Configuration conf, final String tableName)
 What we should do inside of createSubmittableJob is something similar to what 
 the completebulkloads command would do:
 (Taken from org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.java)
 690 boolean tableExists = this.doesTableExist(tableName);
 691 if (!tableExists) this.createTable(tableName,dirPath);
 Currently the docs are misleading, the table in fact must exist prior to 
 running importtsv. We should check if it exists rather than assume it's 
 already there and throw the below exception:
 12/03/14 17:15:42 WARN client.HConnectionManager$HConnectionImplementation: 
 Encountered problems when prefetch META table: 
 org.apache.hadoop.hbase.TableNotFoundException: Cannot find row in .META. for 
 table: myTable2, row=myTable2,,99
   at 
 org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:150)
 ...


[jira] [Commented] (HBASE-5604) M/R tools to replay WAL files

2012-04-10 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13250959#comment-13250959
 ] 

Lars Hofhansl commented on HBASE-5604:
--

* Let's not go overboard on the exceptions. This is an inner class of WALPlayer, 
and WALPlayer will always pass the correct arguments. Maybe I'll make the mapper 
classes private and remove all exception handling.
* Thought about using SimpleDateFormat, then punted. I guess you're right, 
I should make it a bit more user friendly.
* The TODO was left over. Not sure that even makes sense.


 M/R tools to replay WAL files
 -

 Key: HBASE-5604
 URL: https://issues.apache.org/jira/browse/HBASE-5604
 Project: HBase
  Issue Type: New Feature
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Attachments: 5604-v4.txt, 5604-v6.txt, 5604-v7.txt, 5604-v8.txt, 
 HLog-5604-v3.txt


 Just an idea I had. Might be useful for restore of a backup using the HLogs.
 This could be an M/R job (with a mapper per HLog file).
 The tool would get a timerange and a (set of) table(s). We'd pick the right 
 HLogs based on time before the M/R job is started and then have a mapper per 
 HLog file.
 The mapper would then go through the HLog, filter out all WALEdits that don't 
 fit into the time range or don't belong to any of the tables, and then use 
 HFileOutputFormat to generate HFiles.
 Would need to indicate the splits we want, probably from a live table.
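
The filtering step described above might look roughly like this. It is a toy sketch, not the WALPlayer code: `Entry` is a hypothetical stand-in for a WAL entry, and treating the time range as half-open (start inclusive, end exclusive) is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;

public class WalFilterSketch {
    /** Hypothetical stand-in for one WAL entry: table name, write time, opaque payload. */
    static final class Entry {
        final String table;
        final long timestamp;
        final String payload;

        Entry(String table, long timestamp, String payload) {
            this.table = table;
            this.timestamp = timestamp;
            this.payload = payload;
        }
    }

    /**
     * Keeps only entries for the requested tables whose timestamp falls in
     * [startInclusive, endExclusive). The half-open range is an assumption.
     */
    static List<Entry> filter(List<Entry> log, Set<String> tables,
                              long startInclusive, long endExclusive) {
        List<Entry> kept = new ArrayList<>();
        for (Entry e : log) {
            if (!tables.contains(e.table)) {
                continue; // edit belongs to a table we were not asked to replay
            }
            if (e.timestamp < startInclusive || e.timestamp >= endExclusive) {
                continue; // edit is outside the requested time range
            }
            kept.add(e);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Entry> log = Arrays.asList(
            new Entry("t1", 100L, "a"),
            new Entry("t2", 150L, "b"),
            new Entry("t1", 200L, "c"),
            new Entry("t1", 300L, "d"));
        for (Entry e : filter(log, Collections.singleton("t1"), 100L, 300L)) {
            System.out.println(e.table + " " + e.timestamp + " " + e.payload);
        }
    }
}
```

In the M/R setting each mapper would apply such a filter to its own HLog file and hand the surviving edits to the output format.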


[jira] [Commented] (HBASE-3443) ICV optimization to look in memstore first and then store files (HBASE-3082) does not work when deletes are in the mix

2012-04-10 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-3443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13250968#comment-13250968
 ] 

Lars Hofhansl commented on HBASE-3443:
--

For 0.96, though.
I can take this. Can't promise to work on it today, though.
I had checked out the code a while ago to make ICV correct w.r.t. WAL updates; 
this involves mostly removing some crufty code.


 ICV optimization to look in memstore first and then store files (HBASE-3082) 
 does not work when deletes are in the mix
 --

 Key: HBASE-3443
 URL: https://issues.apache.org/jira/browse/HBASE-3443
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.90.0, 0.90.1, 0.90.2, 0.90.3, 0.90.4, 0.90.5, 0.90.6, 
 0.92.0, 0.92.1
Reporter: Kannan Muthukkaruppan
Priority: Critical
  Labels: corruption

 For incrementColumnValue() HBASE-3082 adds an optimization to check memstores 
 first, and only if not present in the memstore then check the store files. In 
 the presence of deletes, the above optimization is not reliable.
 If the column is marked as deleted in the memstore, one should not look 
 further into the store files. But currently, the code does so.
 Sample test code outline:
 {code}
 admin.createTable(desc)
 table = HTable.new(conf, tableName)
 table.incrementColumnValue(Bytes.toBytes(row), cf1name, 
 Bytes.toBytes(column), 5);
 admin.flush(tableName)
 sleep(2)
 del = Delete.new(Bytes.toBytes(row))
 table.delete(del)
 table.incrementColumnValue(Bytes.toBytes(row), cf1name, 
 Bytes.toBytes(column), 5);
 get = Get.new(Bytes.toBytes(row))
 keyValues = table.get(get).raw()
 keyValues.each do |keyValue|
   puts "Expect 5; Got Value=#{Bytes.toLong(keyValue.getValue())}"
 end
 {code}
 The above prints:
 {code}
 Expect 5; Got Value=10
 {code}
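
The failure mode can be reproduced in a toy model. The sketch below is not HBase code: `Cell`, the two maps standing in for memstore and store files, and the `honorMemstoreDelete` flag are all hypothetical and only mirror the scenario in the outline above (increment, flush, delete, increment).

```java
import java.util.HashMap;
import java.util.Map;

public class IcvLookupSketch {
    /** Toy cell: either a counter value or a delete marker. Not an HBase KeyValue. */
    static final class Cell {
        final long value;
        final boolean deleted;

        private Cell(long value, boolean deleted) {
            this.value = value;
            this.deleted = deleted;
        }

        static Cell put(long value) { return new Cell(value, false); }
        static Cell delete() { return new Cell(0L, true); }
    }

    /** Memstore-first lookup; the flag decides whether a delete marker masks older data. */
    static long currentValue(Map<String, Cell> memstore, Map<String, Cell> storeFiles,
                             String column, boolean honorMemstoreDelete) {
        Cell inMem = memstore.get(column);
        if (inMem != null && !inMem.deleted) {
            return inMem.value;
        }
        if (inMem != null && honorMemstoreDelete) {
            return 0L; // deleted in memstore: do not look into the store files
        }
        Cell onDisk = storeFiles.get(column);
        return onDisk == null ? 0L : onDisk.value;
    }

    static long increment(Map<String, Cell> memstore, Map<String, Cell> storeFiles,
                          String column, long amount, boolean honorMemstoreDelete) {
        long next = currentValue(memstore, storeFiles, column, honorMemstoreDelete) + amount;
        memstore.put(column, Cell.put(next));
        return next;
    }

    /** Replays the outline: increment by 5, flush, delete, increment by 5 again. */
    static long runScenario(boolean honorMemstoreDelete) {
        Map<String, Cell> memstore = new HashMap<>();
        Map<String, Cell> storeFiles = new HashMap<>();
        increment(memstore, storeFiles, "column", 5L, honorMemstoreDelete);
        storeFiles.putAll(memstore); // flush
        memstore.clear();
        memstore.put("column", Cell.delete()); // delete lands in the memstore
        return increment(memstore, storeFiles, "column", 5L, honorMemstoreDelete);
    }

    public static void main(String[] args) {
        System.out.println("ignoring memstore delete: " + runScenario(false));
        System.out.println("honoring memstore delete: " + runScenario(true));
    }
}
```

With the flag off, the model falls through to the flushed value and yields 10, matching the reported output; with the flag on it yields the expected 5.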


[jira] [Commented] (HBASE-5728) Methods Missing in HTableInterface

2012-04-10 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13250985#comment-13250985
 ] 

Lars Hofhansl commented on HBASE-5728:
--

{code}
public int getScannerCaching();
public void setScannerCaching(int scannerCaching);
{code}
These two are deprecated (in trunk at least), let's not add them to the 
interface.

 Methods Missing in HTableInterface
 --

 Key: HBASE-5728
 URL: https://issues.apache.org/jira/browse/HBASE-5728
 Project: HBase
  Issue Type: Improvement
  Components: client
Reporter: Bing Li

 Dear all,
 I found that some methods that exist in HTable are not in HTableInterface.
setAutoFlush
setWriteBufferSize
...
 In most cases, I manipulate HBase through HTableInterface from HTablePool. If 
 I need to use the above methods, how can I do that?
 I am considering writing my own table pool if there is no proper way. Is that fine?
 Thanks so much!
 Best regards,
 Bing


[jira] [Commented] (HBASE-5677) The master never does balance because duplicate openhandled the one region

2012-04-10 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13251006#comment-13251006
 ] 

Lars Hofhansl commented on HBASE-5677:
--

Is this at all similar to HBASE-5615?
Maybe here too, there is no point holding up 0.94.0 for this.

 The master never does balance because duplicate openhandled the one region
 --

 Key: HBASE-5677
 URL: https://issues.apache.org/jira/browse/HBASE-5677
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.6
 Environment: 0.90
Reporter: xufeng
Assignee: xufeng
 Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0

 Attachments: HBASE-5677-90-v1.patch, 
 surefire-report_no_patched_v1.html, surefire-report_patched_v1.html


 If a region is assigned while the master is initializing (before 
 processFailover runs), the open of that region is handled twice, 
 because the unassigned node in ZooKeeper is handled again in 
 AssignmentManager#processFailover().
 This leaves the region in RIT, so the master never balances.


[jira] [Commented] (HBASE-5604) M/R tools to replay WAL files

2012-04-10 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13251053#comment-13251053
 ] 

Lars Hofhansl commented on HBASE-5604:
--

@Stack: Yeah, probably. You'd vote for leaving it the way it is?

 M/R tools to replay WAL files
 -

 Key: HBASE-5604
 URL: https://issues.apache.org/jira/browse/HBASE-5604
 Project: HBase
  Issue Type: New Feature
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Attachments: 5604-v4.txt, 5604-v6.txt, 5604-v7.txt, 5604-v8.txt, 
 HLog-5604-v3.txt


 Just an idea I had. Might be useful for restore of a backup using the HLogs.
 This could be an M/R job (with a mapper per HLog file).
 The tool would get a timerange and a (set of) table(s). We'd pick the right 
 HLogs based on time before the M/R job is started and then have a mapper per 
 HLog file.
 The mapper would then go through the HLog, filter out all WALEdits that don't 
 fit into the time range or don't belong to any of the tables, and then use 
 HFileOutputFormat to generate HFiles.
 Would need to indicate the splits we want, probably from a live table.


[jira] [Commented] (HBASE-5677) The master never does balance because duplicate openhandled the one region

2012-04-10 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13251189#comment-13251189
 ] 

Lars Hofhansl commented on HBASE-5677:
--

bq. public boolean isMasterRunning() { return !isStopped() && isInitialized(); 
} But some classes, like HMerge, only care whether the master is running or not.

Is that actually a problem? HMerge would need to wait until the master is 
initialized. Seems generally a better condition for running than 
just having the process up.
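
To make the point concrete, here is a minimal sketch of the quoted condition and of a tool waiting on it. `MasterStateSketch` and `waitForMaster` are hypothetical names for illustration; this is not the HMaster or HMerge code.

```java
public class MasterStateSketch {
    private volatile boolean stopped = false;
    private volatile boolean initialized = false;

    public boolean isStopped() { return stopped; }
    public boolean isInitialized() { return initialized; }
    public void markInitialized() { initialized = true; }
    public void stop() { stopped = true; }

    /** "Running" means the process is up AND initialization has finished. */
    public boolean isMasterRunning() {
        return !isStopped() && isInitialized();
    }

    /**
     * Polls until the master reports running or the attempts are used up.
     * A tool could wait like this instead of acting on a half-initialized master.
     */
    static boolean waitForMaster(MasterStateSketch master, int maxAttempts, long sleepMillis) {
        for (int i = 0; i < maxAttempts; i++) {
            if (master.isMasterRunning()) {
                return true;
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for callers
                return false;
            }
        }
        return master.isMasterRunning();
    }

    public static void main(String[] args) {
        MasterStateSketch master = new MasterStateSketch();
        System.out.println(waitForMaster(master, 2, 1L)); // not initialized yet
        master.markInitialized();
        System.out.println(waitForMaster(master, 2, 1L)); // initialized and not stopped
    }
}
```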

 The master never does balance because duplicate openhandled the one region
 --

 Key: HBASE-5677
 URL: https://issues.apache.org/jira/browse/HBASE-5677
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.6
 Environment: 0.90
Reporter: xufeng
Assignee: xufeng
 Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0

 Attachments: HBASE-5677-90-v1.patch, 
 surefire-report_no_patched_v1.html, surefire-report_patched_v1.html


 If a region is assigned while the master is initializing (before 
 processFailover runs), the open of that region is handled twice, 
 because the unassigned node in ZooKeeper is handled again in 
 AssignmentManager#processFailover().
 This leaves the region in RIT, so the master never balances.


[jira] [Commented] (HBASE-5756) we can change defalult File Appender to RFA instead of DRFA.

2012-04-10 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13251315#comment-13251315
 ] 

Lars Hofhansl commented on HBASE-5756:
--

Anybody can still change the log4j.properties to get the desired behavior. I 
see no need to put this into 0.94.

That said, the change in HBASE-5655 is simple; if you feel strongly, I'll put 
it into 0.94.


 we can change defalult File Appender to RFA instead of DRFA.
 

 Key: HBASE-5756
 URL: https://issues.apache.org/jira/browse/HBASE-5756
 Project: HBase
  Issue Type: Bug
Reporter: rohithsharma
Priority: Minor

 This can be a concern when, on a given day, heavy activity produces much more 
 logging than usual. In that case the log file for that day can grow huge, and 
 such logs cannot be opened for analysis because they are too large.


[jira] [Commented] (HBASE-5755) Region sever looking for master forever with cached stale data.

2012-04-10 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13251327#comment-13251327
 ] 

Lars Hofhansl commented on HBASE-5755:
--

Looks like MasterAddressTracker.java was moved from 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase to 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper in 0.96

 Region sever looking for master forever with cached stale data.
 ---

 Key: HBASE-5755
 URL: https://issues.apache.org/jira/browse/HBASE-5755
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.96.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Fix For: 0.96.0

 Attachments: hbase-5755.patch


 When the master address tracker doesn't have the master address ZK data, or 
 the cached data is wrong, region server should not use the cached data.
 It should pull the data from ZK directly again.


[jira] [Commented] (HBASE-5751) hbase master stop does not bring down backup masters

2012-04-09 Thread Lars Hofhansl (Commented) (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13250056#comment-13250056
]

Lars Hofhansl commented on HBASE-5751:
--

Tail of discussion from parent:

stack added a comment - 06/Apr/12 15:03

This seems to have broken 0.90 builds. Can we revert from 0.90 trunk till
fixed. See here: https://builds.apache.org/view/G-L/view/HBase/job/hbase-0.90/
Build #458 is where it got committed and thereafter the build fails rolling the
wal test.

David S. Wang added a comment - 06/Apr/12 15:18

Greg is out of town for the next week or so ... IMHO if it's breaking 0.90, we
should revert for now and Greg can look at it when he gets back.

Zhihong Yu added a comment - 06/Apr/12 16:11

TestLogRolling hangs in 0.90

Jonathan Hsieh added a comment - 06/Apr/12 16:21

I'll revert the 0.90 version. Sorry about this fellas.

Lars Hofhansl added a comment - 09/Apr/12 03:31

So from the discussion the test is only a problem in 0.90? Do we know why this
is?
Asking because we're close to another RC attempt for 0.94.0.

Jonathan Hsieh added a comment - 09/Apr/12 03:37

The issue is apparently only in 0.90. How about we close this issue for
0.92/0.94/trunk and create a follow on issue for 0.90? This will unblock this
for 0.94 and Greg can address this when he gets back.

hbase master stop does not bring down backup masters
--

Key: HBASE-5751
URL: https://issues.apache.org/jira/browse/HBASE-5751
Project: HBase
Issue Type: Sub-task
Reporter: Lars Hofhansl
Assignee: Gregory Chanan
Fix For: 0.90.7

Carry forward the discussion from parent for 0.90

[jira] [Commented] (HBASE-5615) the master never does balance because of balancing the parent region

2012-04-09 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13250069#comment-13250069
 ] 

Lars Hofhansl commented on HBASE-5615:
--

Thanks Ram... I'll hold up 0.94 for this (unless you think that unnecessary).

 the master never does balance because of balancing the parent region
 

 Key: HBASE-5615
 URL: https://issues.apache.org/jira/browse/HBASE-5615
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.7
Reporter: xufeng
Assignee: xufeng
Priority: Critical
 Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0

 Attachments: 5615-trunk.txt, HBASE-5615-90.patch, HBASE-5615.patch, 
 NoPatched-surefire-report-5615-90.html, Patched_surefire-report-5615-90.html


 The master never balances because, when it runs rebuildUserRegions(), it 
 adds the parent region to AssignmentManager#servers.
 If the balancer then tries to move the parent region, the parent stays in RIT 
 forever, so balancing is never executed.


[jira] [Commented] (HBASE-5604) HLog replay tool that generates HFiles for use by LoadIncrementalHFiles.

2012-04-09 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13250336#comment-13250336
 ] 

Lars Hofhansl commented on HBASE-5604:
--

TestForceCacheImportantBlocks passes locally. I'm ready to commit.

 HLog replay tool that generates HFiles for use by LoadIncrementalHFiles.
 

 Key: HBASE-5604
 URL: https://issues.apache.org/jira/browse/HBASE-5604
 Project: HBase
  Issue Type: New Feature
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Attachments: 5604-v4.txt, 5604-v6.txt, 5604-v7.txt, HLog-5604-v3.txt


 Just an idea I had. Might be useful for restore of a backup using the HLogs.
 This could be an M/R job (with a mapper per HLog file).
 The tool would get a timerange and a (set of) table(s). We'd pick the right 
 HLogs based on time before the M/R job is started and then have a mapper per 
 HLog file.
 The mapper would then go through the HLog, filter out all WALEdits that don't 
 fit into the time range or don't belong to any of the tables, and then use 
 HFileOutputFormat to generate HFiles.
 Would need to indicate the splits we want, probably from a live table.


[jira] [Commented] (HBASE-5604) M/R tools to replay WAL files

2012-04-09 Thread Lars Hofhansl (Commented) (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13250439#comment-13250439
]

Lars Hofhansl commented on HBASE-5604:
--

Thanks Ted.

* I'll add Javadoc.
* An index of 0 is used here, since when creating HFiles for bulk import only a
single table is currently allowed (that is also documented in usage(), but
perhaps not clearly enough...?). I'll add a comment to that effect.
* That the two arrays are of the same size is guaranteed in
WALPlayer.createSubmittableJob(), but perhaps it is better to double check here.
* Checking heapSize() seems unnecessary. After all, this is a single WALEdit.

M/R tools to replay WAL files
-

Key: HBASE-5604
URL: https://issues.apache.org/jira/browse/HBASE-5604
Project: HBase
Issue Type: New Feature
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Attachments: 5604-v4.txt, 5604-v6.txt, 5604-v7.txt, HLog-5604-v3.txt

[jira] [Commented] (HBASE-5677) The master never does balance because duplicate openhandled the one region

2012-04-09 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13250445#comment-13250445
 ] 

Lars Hofhansl commented on HBASE-5677:
--

Seems this should be in 0.94.0. Agreed?

 The master never does balance because duplicate openhandled the one region
 --

 Key: HBASE-5677
 URL: https://issues.apache.org/jira/browse/HBASE-5677
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.6
 Environment: 0.90
Reporter: xufeng
Assignee: xufeng
 Attachments: HBASE-5677-90-v1.patch, 
 surefire-report_no_patched_v1.html, surefire-report_patched_v1.html


 If a region is assigned while the master is initializing (before 
 processFailover runs), the open of that region is handled twice, 
 because the unassigned node in ZooKeeper is handled again in 
 AssignmentManager#processFailover().
 This leaves the region in RIT, so the master never balances.


[jira] [Commented] (HBASE-5656) LoadIncrementalHFiles createTable should detect and set compression algorithm

2012-04-08 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13249567#comment-13249567
 ] 

Lars Hofhansl commented on HBASE-5656:
--

+1 on patch. Failures seem unrelated.

 LoadIncrementalHFiles createTable should detect and set compression algorithm
 -

 Key: HBASE-5656
 URL: https://issues.apache.org/jira/browse/HBASE-5656
 Project: HBase
  Issue Type: Bug
  Components: util
Affects Versions: 0.92.1
Reporter: Cosmin Lehene
Assignee: Cosmin Lehene
 Fix For: 0.92.2, 0.94.0, 0.96.0

 Attachments: 5656-simple.txt, HBASE-5656-0.92.patch, 
 HBASE-5656-0.92.patch, HBASE-5656-0.92.patch, HBASE-5656-0.92.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 LoadIncrementalHFiles doesn't set compression when creating the table.
 This can be detected from the files within each family dir. 


[jira] [Commented] (HBASE-1936) ClassLoader that loads from hdfs; useful adding filters to classpath without having to restart services

2012-04-08 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13249656#comment-13249656
 ] 

Lars Hofhansl commented on HBASE-1936:
--

We have since solved this problem with coprocessors, right?
It might make sense to look at how coprocessors handle loading from HDFS and 
(hopefully) reuse the bits.

Thanks for picking this up again Jieshan.

 ClassLoader that loads from hdfs; useful adding filters to classpath without 
 having to restart services
 ---

 Key: HBASE-1936
 URL: https://issues.apache.org/jira/browse/HBASE-1936
 Project: HBase
  Issue Type: New Feature
Reporter: stack
Assignee: Daniel Ploeg
  Labels: noob
 Attachments: cp_from_hdfs.patch





[jira] [Commented] (HBASE-5213) hbase master stop does not bring down backup masters

2012-04-08 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13249658#comment-13249658
 ] 

Lars Hofhansl commented on HBASE-5213:
--

So from the discussion the test is only a problem in 0.90? Do we know why this 
is?
Asking because we're close to another RC attempt for 0.94.0.

 hbase master stop does not bring down backup masters
 --

 Key: HBASE-5213
 URL: https://issues.apache.org/jira/browse/HBASE-5213
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.5, 0.92.0, 0.94.0, 0.96.0
Reporter: Gregory Chanan
Assignee: Gregory Chanan
Priority: Minor
 Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0

 Attachments: 5213.jstack, HBASE-5213-v0-trunk.patch, 
 HBASE-5213-v1-trunk.patch, HBASE-5213-v2-90.patch, HBASE-5213-v2-92.patch, 
 HBASE-5213-v2-trunk.patch


 Typing hbase master stop produces the following message:
 stop   Start cluster shutdown; Master signals RegionServer shutdown
 It seems like backup masters should be considered part of the cluster, but 
 they are not brought down by hbase master stop.
 stop-hbase.sh does correctly bring down the backup masters.
 The same behavior is observed when a client app makes use of the client API 
 HBaseAdmin.shutdown() 
 http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HBaseAdmin.html#shutdown()
  -- this isn't too surprising since I think hbase master stop just calls 
 this API.
 It seems like HBASE-1448 addresses this; perhaps there was a regression?


[jira] [Commented] (HBASE-5720) HFileDataBlockEncoderImpl uses wrong header size when reading HFiles with no checksums

2012-04-07 Thread Lars Hofhansl (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13249328#comment-13249328
 ] 

Lars Hofhansl commented on HBASE-5720:
--

That patch looks like it should work. As I mentioned in an earlier comment, 
though, this strategy is brittle, especially considering future changes. At the 
same time I cannot think of anything better offhand.

Let me move the trunk discussion into another jira and commit the 0.94 fix via 
this one. Everybody OK with that?


 HFileDataBlockEncoderImpl uses wrong header size when reading HFiles with no 
 checksums
 --

 Key: HBASE-5720
 URL: https://issues.apache.org/jira/browse/HBASE-5720
 Project: HBase
  Issue Type: Bug
  Components: io, regionserver
Affects Versions: 0.94.0
Reporter: Matt Corgan
Priority: Blocker
 Fix For: 0.94.0

 Attachments: 5720-trunk-v2.txt, 5720-trunk.txt, 5720v4.txt, 
 5720v4.txt, 5720v4.txt, HBASE-5720-v1.patch, HBASE-5720-v2.patch, 
 HBASE-5720-v3.patch


 When reading a .92 HFile without checksums, encoding it, and storing in the 
 block cache, the HFileDataBlockEncoderImpl always allocates a dummy header 
 appropriate for checksums even though there are none.  This corrupts the 
 byte[].
 Attaching a patch that allocates a DUMMY_HEADER_NO_CHECKSUM in that case 
 which I think is the desired behavior.

