[jira] [Commented] (HBASE-5635) If getTaskList() returns null splitlogWorker is down. It wont serve any requests.

2012-04-20 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13258414#comment-13258414
 ] 

Lars Hofhansl commented on HBASE-5635:
--

+1 on patch. I wonder how many more problems like this we have lurking in the 
worker threads.
Can you do a 0.94.1 patch as well?
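For anyone skimming the quoted code below: a minimal sketch of the kind of 
defensive handling being discussed (an illustration only, not the attached 
patch; exitWorker is an assumed shutdown flag and the 1s back-off just mirrors 
the existing retry sleep):

{code}
// Illustration only: keep the worker alive by re-polling instead of giving up
// when getTaskList() returns null because ZooKeeper was unreachable.
private List<String> getTaskListBlocking() throws InterruptedException {
  List<String> paths = null;
  while (paths == null && !exitWorker) {   // exitWorker: assumed shutdown flag
    paths = getTaskList();                 // may return null after zkretries
    if (paths == null) {
      Thread.sleep(1000);                  // back off before asking ZK again
    }
  }
  return paths;
}
{code}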

> If getTaskList() returns null splitlogWorker is down. It wont serve any 
> requests. 
> --
>
> Key: HBASE-5635
> URL: https://issues.apache.org/jira/browse/HBASE-5635
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 0.92.1
>Reporter: Kristam Subba Swathi
> Attachments: HBASE-5635.1.patch, HBASE-5635.2.patch, HBASE-5635.patch
>
>
> During the HLog split operation, if all the ZooKeepers are down, then the 
> paths will be returned as null and the SplitLogWorker thread will exit.
> Now this regionserver will not be able to acquire any other tasks, since the 
> SplitLogWorker thread has exited.
> Please find the attached code for more details:
> {code}
> private List<String> getTaskList() {
>   for (int i = 0; i < zkretries; i++) {
>     try {
>       return (ZKUtil.listChildrenAndWatchForNewChildren(this.watcher,
>           this.watcher.splitLogZNode));
>     } catch (KeeperException e) {
>       LOG.warn("Could not get children of znode " +
>           this.watcher.splitLogZNode, e);
>       try {
>         Thread.sleep(1000);
>       } catch (InterruptedException e1) {
>         LOG.warn("Interrupted while trying to get task list ...", e1);
>         Thread.currentThread().interrupt();
>         return null;
>       }
>     }
>   }
>   // All retries exhausted: the caller receives null and the worker thread exits.
>   return null;
> }
> {code}
> in the org.apache.hadoop.hbase.regionserver.SplitLogWorker 
>  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5547) Don't delete HFiles when in "backup mode"

2012-04-19 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13257818#comment-13257818
 ] 

Lars Hofhansl commented on HBASE-5547:
--

For this I was actually thinking a single cluster-wide setting ("This HBase 
cluster is now in backup mode"), which would affect all tables, would be 
sufficient.

But we could also do it per table, as Jesse has done, and it definitely does not 
need to be a custom znode; checking a value is fine.
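To make "checking a value" concrete, a tiny sketch (the znode path is a made-up 
example, and ZKUtil.getData/Bytes.toString are only assumed to be the convenient 
calls here; this is not the actual patch):

{code}
// Illustration only: one cluster-wide flag, read by value, instead of a custom
// znode per table. The path below is invented for the example.
boolean isBackupMode(ZooKeeperWatcher watcher) throws KeeperException {
  byte[] data = ZKUtil.getData(watcher, "/hbase/backup-mode");
  return data != null && "true".equals(Bytes.toString(data));
}
{code}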


> Don't delete HFiles when in "backup mode"
> -
>
> Key: HBASE-5547
> URL: https://issues.apache.org/jira/browse/HBASE-5547
> Project: HBase
>  Issue Type: New Feature
>Reporter: Lars Hofhansl
>Assignee: Jesse Yates
>
> This came up in a discussion I had with Stack.
> It would be nice if HBase could be notified that a backup is in progress (via 
> a znode for example) and in that case either:
> 1. rename HFiles to be deleted to .bck
> 2. rename the HFiles into a special directory
> 3. rename them to a general trash directory (which would not need to be tied 
> to backup mode).
> That way it should be possible to get a consistent backup based on HFiles (HDFS 
> snapshots or hard links would be better options here, but we do not have 
> those).
> #1 makes cleanup a bit harder.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5547) Don't delete HFiles when in "backup mode"

2012-04-18 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13257114#comment-13257114
 ] 

Lars Hofhansl commented on HBASE-5547:
--

What I had in mind when I filed this was something quite simple: a single znode 
that all RegionServers would check when deleting any HFile, and then, based on 
the existence or value of that znode, either delete the HFile or rename it. Of 
course, things are never quite as simple...

To get a guaranteed consistent snapshot the RegionServers need to check for the 
znode's value synchronously in the delete path (or at least I see no other 
way). Otherwise there are times when the RegionServers do not agree and some 
files will be deleted and some will be backed up with no possibility for the 
client to know exactly as of when the backup would be consistent.

Since HFiles are deleted as a result of a compaction, in an asynchronous thread, 
synchronously checking the znode should not cause performance issues, unless we 
fear we'll overload ZK.

This simple solution would add special code for just this scenario, which is 
bad. At the same time it would be relatively simple (famous last words), so 
that's something to weigh.
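As a rough sketch of what the synchronous check in the delete path could look 
like (isBackupMode() and the archive directory are assumptions for illustration, 
not the proposed implementation):

{code}
// Illustration only: decide at delete time whether to really remove the HFile
// or move it aside so a backup can still pick it up.
void removeHFile(FileSystem fs, Path hfile, Path backupDir) throws IOException {
  if (isBackupMode()) {                     // assumed helper checking the znode
    // A rename is a cheap metadata operation in HDFS, so doing this
    // synchronously in the compaction cleanup thread should be acceptable.
    fs.rename(hfile, new Path(backupDir, hfile.getName()));
  } else {
    fs.delete(hfile, false);                // non-recursive: this is a file
  }
}
{code}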

> Don't delete HFiles when in "backup mode"
> -
>
> Key: HBASE-5547
> URL: https://issues.apache.org/jira/browse/HBASE-5547
> Project: HBase
>  Issue Type: New Feature
>Reporter: Lars Hofhansl
>Assignee: Jesse Yates
>
> This came up in a discussion I had with Stack.
> It would be nice if HBase could be notified that a backup is in progress (via 
> a znode for example) and in that case either:
> 1. rename HFiles to be deleted to .bck
> 2. rename the HFiles into a special directory
> 3. rename them to a general trash directory (which would not need to be tied 
> to backup mode).
> That way it should be possible to get a consistent backup based on HFiles (HDFS 
> snapshots or hard links would be better options here, but we do not have 
> those).
> #1 makes cleanup a bit harder.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487

2012-04-18 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13257105#comment-13257105
 ] 

Lars Hofhansl commented on HBASE-5782:
--

Yes.

> Edits can be appended out of seqid order since HBASE-4487
> -
>
> Key: HBASE-5782
> URL: https://issues.apache.org/jira/browse/HBASE-5782
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 0.94.0
>Reporter: Gopinathan A
>Assignee: Lars Hofhansl
>Priority: Blocker
> Fix For: 0.94.0
>
> Attachments: 5782-lars-v2.txt, 5782-sketch.txt, 5782-v3.txt, 
> 5782.txt, 5782.unfinished-stack.txt, 5782.unittest.txt, HBASE-5782.patch, 
> hbase-5782.txt
>
>
> Create a table with 1000 splits; after the region assignment, kill the 
> regionserver which contains the META table.
> Here a few regions are missing after the log splitting and region assignment. 
> The HBCK report shows that multiple region holes got created.
> The same scenario was verified multiple times in 0.92.1 with no issues.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.

2012-04-18 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256938#comment-13256938
 ] 

Lars Hofhansl commented on HBASE-5545:
--

I'm going to commit this to 0.94 and 0.96 in the next few minutes.

> region can't be opened for a long time. Because the creating File failed.
> -
>
> Key: HBASE-5545
> URL: https://issues.apache.org/jira/browse/HBASE-5545
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 0.90.6
>Reporter: gaojinchao
>Assignee: ramkrishna.s.vasudevan
> Fix For: 0.90.7, 0.92.2, 0.94.0
>
> Attachments: HBASE-5545.patch, HBASE-5545.patch
>
>
> Scenario:
> 
> 1. File is created 
> 2. But while writing data, all datanodes might have crashed. So writing data 
> will fail.
> 3. Now even if close is called in the finally block, close will also fail and 
> throw an exception because writing the data failed.
> 4. After this, if the RS tries to create the same file again, an 
> AlreadyBeingCreatedException will be thrown.
> Suggestion to handle this scenario.
> ---
> 1. Check for the existence of the file; if it exists, delete the file and 
> create a new file. 
> Here the delete call for the file will not check whether the file is open or 
> closed.
> Overwrite Option:
> 
> 1. The overwrite option is applicable if you are trying to overwrite a 
> closed file.
> 2. If the file is not closed, then even with the overwrite option the same 
> AlreadyBeingCreatedException will be thrown.
> This is the expected behaviour, to avoid multiple clients writing to the same 
> file.
> Region server logs:
> org.apache.hadoop.ipc.RemoteException: 
> org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to 
> create file /hbase/test1/12c01902324218d14b17a5880f24f64b/.tmp/.regioninfo 
> for 
> DFSClient_hb_rs_158-1-131-48,20020,1331107668635_1331107669061_-252463556_25 
> on client 158.1.132.19 because current leaseholder is trying to recreate file.
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:1570)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1440)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:1382)
> at org.apache.hadoop.hdfs.server.namenode.NameNode.create(NameNode.java:658)
> at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:547)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1137)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1133)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1131)
> at org.apache.hadoop.ipc.Client.call(Client.java:961)
> at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:245)
> at $Proxy6.create(Unknown Source)
> at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at $Proxy6.create(Unknown Source)
> at 
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.<init>(DFSClient.java:3643)
> at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:778)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:364)
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:630)
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:611)
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:518)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.checkRegioninfoOnFilesystem(HRegion.java:424)
> at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:340)
> at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2672)
> at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2658)
> at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:330)
> at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:116)
> at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:158)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> [2012-03-07 20:51:45,858] [WARN ] 
> [RS_OPEN_REGION-158-1-131-48,20020,1331107668635-23] 
>

[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487

2012-04-18 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256933#comment-13256933
 ] 

Lars Hofhansl commented on HBASE-5782:
--

Note that TestHLog still does not run anything locally (neither in 0.94 nor in 
trunk); there seems to be something going on with that specific test.

> Edits can be appended out of seqid order since HBASE-4487
> -
>
> Key: HBASE-5782
> URL: https://issues.apache.org/jira/browse/HBASE-5782
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 0.94.0
>Reporter: Gopinathan A
>Assignee: Lars Hofhansl
>Priority: Blocker
> Fix For: 0.94.0
>
> Attachments: 5782-lars-v2.txt, 5782-sketch.txt, 5782-v3.txt, 
> 5782.txt, 5782.unfinished-stack.txt, 5782.unittest.txt, HBASE-5782.patch, 
> hbase-5782.txt
>
>
> Create a table with 1000 splits; after the region assignment, kill the 
> regionserver which contains the META table.
> Here a few regions are missing after the log splitting and region assignment. 
> The HBCK report shows that multiple region holes got created.
> The same scenario was verified multiple times in 0.92.1 with no issues.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.

2012-04-18 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256757#comment-13256757
 ] 

Lars Hofhansl commented on HBASE-5545:
--

@Ram: yes, that is what I meant to say (recursive is not needed, and somebody 
might misunderstand later and use this to delete the tmp directory). Not important.

+1 on commit as is.

> region can't be opened for a long time. Because the creating File failed.
> -
>
> Key: HBASE-5545
> URL: https://issues.apache.org/jira/browse/HBASE-5545
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 0.90.6
>Reporter: gaojinchao
>Assignee: ramkrishna.s.vasudevan
> Fix For: 0.90.7, 0.92.2, 0.94.0
>
> Attachments: HBASE-5545.patch, HBASE-5545.patch
>
>
> Scenario:
> 
> 1. File is created 
> 2. But while writing data, all datanodes might have crashed. So writing data 
> will fail.
> 3. Now even if close is called in the finally block, close will also fail and 
> throw an exception because writing the data failed.
> 4. After this, if the RS tries to create the same file again, an 
> AlreadyBeingCreatedException will be thrown.
> Suggestion to handle this scenario.
> ---
> 1. Check for the existence of the file; if it exists, delete the file and 
> create a new file. 
> Here the delete call for the file will not check whether the file is open or 
> closed.
> Overwrite Option:
> 
> 1. The overwrite option is applicable if you are trying to overwrite a 
> closed file.
> 2. If the file is not closed, then even with the overwrite option the same 
> AlreadyBeingCreatedException will be thrown.
> This is the expected behaviour, to avoid multiple clients writing to the same 
> file.
> Region server logs:
> org.apache.hadoop.ipc.RemoteException: 
> org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to 
> create file /hbase/test1/12c01902324218d14b17a5880f24f64b/.tmp/.regioninfo 
> for 
> DFSClient_hb_rs_158-1-131-48,20020,1331107668635_1331107669061_-252463556_25 
> on client 158.1.132.19 because current leaseholder is trying to recreate file.
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:1570)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1440)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:1382)
> at org.apache.hadoop.hdfs.server.namenode.NameNode.create(NameNode.java:658)
> at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:547)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1137)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1133)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1131)
> at org.apache.hadoop.ipc.Client.call(Client.java:961)
> at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:245)
> at $Proxy6.create(Unknown Source)
> at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at $Proxy6.create(Unknown Source)
> at 
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.<init>(DFSClient.java:3643)
> at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:778)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:364)
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:630)
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:611)
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:518)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.checkRegioninfoOnFilesystem(HRegion.java:424)
> at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:340)
> at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2672)
> at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2658)
> at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:330)
> at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:116)
> at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:158)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.

[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487

2012-04-18 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256756#comment-13256756
 ] 

Lars Hofhansl commented on HBASE-5782:
--

For some reason I cannot run TestHLog locally. I always get:
{code}
Tests run: 0, Failures: 0, Errors: 0, Skipped: 0
{code}

I can run all other tests that I tried (including some others marked with 
@LargeTests). Verified again with HLogPerformanceEvaluation manually.
Should we put HBASE-5792 in 0.94 as well, so that we use this test?

> Edits can be appended out of seqid order since HBASE-4487
> -
>
> Key: HBASE-5782
> URL: https://issues.apache.org/jira/browse/HBASE-5782
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 0.94.0
>Reporter: Gopinathan A
>Assignee: Lars Hofhansl
>Priority: Blocker
> Fix For: 0.94.0
>
> Attachments: 5782-lars-v2.txt, 5782-sketch.txt, 5782-v3.txt, 
> 5782.txt, 5782.unfinished-stack.txt, 5782.unittest.txt, HBASE-5782.patch, 
> hbase-5782.txt
>
>
> Create a table with 1000 splits; after the region assignment, kill the 
> regionserver which contains the META table.
> Here a few regions are missing after the log splitting and region assignment. 
> The HBCK report shows that multiple region holes got created.
> The same scenario was verified multiple times in 0.92.1 with no issues.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.

2012-04-17 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256210#comment-13256210
 ] 

Lars Hofhansl commented on HBASE-5545:
--

The 2nd patch looks good. I don't think you needed to add the FSUtil code, but 
it cannot hurt.
fs.delete does not throw if the file does not exist, right?
Do you want to set recursive to false (just in case somebody changes this 
around and it ends up pointing to a directory)? This is a super minor nit.

+1 on commit if delete does not throw on a non-existent file.
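For reference, the sequence under discussion boils down to something like this 
sketch (the path is illustrative; it assumes, as the question above implies, 
that delete on a missing path simply returns false instead of throwing):

{code}
// Illustration only: delete any half-written leftover before recreating it.
Path regioninfo = new Path(tmpDir, ".regioninfo");   // illustrative path
fs.delete(regioninfo, false);    // recursive=false: we only ever expect a file
FSDataOutputStream out = fs.create(regioninfo);      // start over from scratch
{code}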

> region can't be opened for a long time. Because the creating File failed.
> -
>
> Key: HBASE-5545
> URL: https://issues.apache.org/jira/browse/HBASE-5545
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 0.90.6
>Reporter: gaojinchao
>Assignee: gaojinchao
> Fix For: 0.90.7, 0.92.2, 0.94.0
>
> Attachments: HBASE-5545.patch, HBASE-5545.patch
>
>
> Scenario:
> 
> 1. File is created 
> 2. But while writing data, all datanodes might have crashed. So writing data 
> will fail.
> 3. Now even if close is called in the finally block, close will also fail and 
> throw an exception because writing the data failed.
> 4. After this, if the RS tries to create the same file again, an 
> AlreadyBeingCreatedException will be thrown.
> Suggestion to handle this scenario.
> ---
> 1. Check for the existence of the file; if it exists, delete the file and 
> create a new file. 
> Here the delete call for the file will not check whether the file is open or 
> closed.
> Overwrite Option:
> 
> 1. The overwrite option is applicable if you are trying to overwrite a 
> closed file.
> 2. If the file is not closed, then even with the overwrite option the same 
> AlreadyBeingCreatedException will be thrown.
> This is the expected behaviour, to avoid multiple clients writing to the same 
> file.
> Region server logs:
> org.apache.hadoop.ipc.RemoteException: 
> org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to 
> create file /hbase/test1/12c01902324218d14b17a5880f24f64b/.tmp/.regioninfo 
> for 
> DFSClient_hb_rs_158-1-131-48,20020,1331107668635_1331107669061_-252463556_25 
> on client 158.1.132.19 because current leaseholder is trying to recreate file.
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:1570)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1440)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:1382)
> at org.apache.hadoop.hdfs.server.namenode.NameNode.create(NameNode.java:658)
> at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:547)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1137)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1133)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1131)
> at org.apache.hadoop.ipc.Client.call(Client.java:961)
> at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:245)
> at $Proxy6.create(Unknown Source)
> at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at $Proxy6.create(Unknown Source)
> at 
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.<init>(DFSClient.java:3643)
> at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:778)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:364)
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:630)
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:611)
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:518)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.checkRegioninfoOnFilesystem(HRegion.java:424)
> at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:340)
> at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2672)
> at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2658)
> at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:330)
> at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:116)
> at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:158)
> at 
> java.util.concurrent.Th

[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.

2012-04-17 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256206#comment-13256206
 ] 

Lars Hofhansl commented on HBASE-5545:
--

I see. Yep, that is more likely to happen than a DN crash (I think).
No hurry Ram :)


> region can't be opened for a long time. Because the creating File failed.
> -
>
> Key: HBASE-5545
> URL: https://issues.apache.org/jira/browse/HBASE-5545
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 0.90.6
>Reporter: gaojinchao
>Assignee: gaojinchao
> Fix For: 0.90.7, 0.92.2, 0.94.0
>
> Attachments: HBASE-5545.patch
>
>
> Scenario:
> 
> 1. File is created 
> 2. But while writing data, all datanodes might have crashed. So writing data 
> will fail.
> 3. Now even if close is called in the finally block, close will also fail and 
> throw an exception because writing the data failed.
> 4. After this, if the RS tries to create the same file again, an 
> AlreadyBeingCreatedException will be thrown.
> Suggestion to handle this scenario.
> ---
> 1. Check for the existence of the file; if it exists, delete the file and 
> create a new file. 
> Here the delete call for the file will not check whether the file is open or 
> closed.
> Overwrite Option:
> 
> 1. The overwrite option is applicable if you are trying to overwrite a 
> closed file.
> 2. If the file is not closed, then even with the overwrite option the same 
> AlreadyBeingCreatedException will be thrown.
> This is the expected behaviour, to avoid multiple clients writing to the same 
> file.
> Region server logs:
> org.apache.hadoop.ipc.RemoteException: 
> org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to 
> create file /hbase/test1/12c01902324218d14b17a5880f24f64b/.tmp/.regioninfo 
> for 
> DFSClient_hb_rs_158-1-131-48,20020,1331107668635_1331107669061_-252463556_25 
> on client 158.1.132.19 because current leaseholder is trying to recreate file.
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:1570)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1440)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:1382)
> at org.apache.hadoop.hdfs.server.namenode.NameNode.create(NameNode.java:658)
> at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:547)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1137)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1133)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1131)
> at org.apache.hadoop.ipc.Client.call(Client.java:961)
> at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:245)
> at $Proxy6.create(Unknown Source)
> at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at $Proxy6.create(Unknown Source)
> at 
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.<init>(DFSClient.java:3643)
> at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:778)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:364)
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:630)
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:611)
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:518)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.checkRegioninfoOnFilesystem(HRegion.java:424)
> at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:340)
> at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2672)
> at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2658)
> at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:330)
> at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:116)
> at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:158)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> [2012-03-07 20:51:45,858] [WARN ] 
> [RS_OPEN_REGION-158-1-131-48,20020,1331107668635-23] 
> [com.hu

[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487

2012-04-17 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256202#comment-13256202
 ] 

Lars Hofhansl commented on HBASE-5782:
--

I can work on a test, unless you'd like to, Stack, or maybe you have started on 
it already.

> Edits can be appended out of seqid order since HBASE-4487
> -
>
> Key: HBASE-5782
> URL: https://issues.apache.org/jira/browse/HBASE-5782
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 0.94.0
>Reporter: Gopinathan A
>Assignee: ramkrishna.s.vasudevan
>Priority: Blocker
> Fix For: 0.94.0
>
> Attachments: 5782-lars-v2.txt, 5782-sketch.txt, 5782.txt, 
> 5782.unfinished-stack.txt, HBASE-5782.patch, hbase-5782.txt
>
>
> Create a table with 1000 splits; after the region assignment, kill the 
> regionserver which contains the META table.
> Here a few regions are missing after the log splitting and region assignment. 
> The HBCK report shows that multiple region holes got created.
> The same scenario was verified multiple times in 0.92.1 with no issues.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487

2012-04-17 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256200#comment-13256200
 ] 

Lars Hofhansl commented on HBASE-5782:
--

Yes, we need a test for this. What scared me most about this bug was that no 
test caught it, and this is one of *the* core parts of HBase.

This test would basically just call code from HLogPerformanceEvaluation, right?
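Roughly, the shape of such a test might be (a sketch only: appendRandomEdit() 
and readSeqIdsInFileOrder() are hypothetical stand-ins for the HLog append and 
read-back calls the real test would use):

{code}
// Idea: append from many threads, then read the log back and assert that the
// sequence ids appear in strictly increasing order in the file.
ExecutorService pool = Executors.newFixedThreadPool(10);
for (int t = 0; t < 10; t++) {
  pool.submit(new Runnable() {
    public void run() {
      for (int i = 0; i < 1000; i++) {
        appendRandomEdit();        // concurrent HLog append (+ occasional sync)
      }
    }
  });
}
pool.shutdown();
pool.awaitTermination(5, TimeUnit.MINUTES);

long last = -1;
for (long seqId : readSeqIdsInFileOrder()) {   // walk the written log
  assertTrue("out of order: " + seqId + " after " + last, seqId > last);
  last = seqId;
}
{code}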


> Edits can be appended out of seqid order since HBASE-4487
> -
>
> Key: HBASE-5782
> URL: https://issues.apache.org/jira/browse/HBASE-5782
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 0.94.0
>Reporter: Gopinathan A
>Assignee: ramkrishna.s.vasudevan
>Priority: Blocker
> Fix For: 0.94.0
>
> Attachments: 5782-lars-v2.txt, 5782-sketch.txt, 5782.txt, 
> 5782.unfinished-stack.txt, HBASE-5782.patch, hbase-5782.txt
>
>
> Create a table with 1000 splits; after the region assignment, kill the 
> regionserver which contains the META table.
> Here a few regions are missing after the log splitting and region assignment. 
> The HBCK report shows that multiple region holes got created.
> The same scenario was verified multiple times in 0.92.1 with no issues.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.

2012-04-17 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256198#comment-13256198
 ] 

Lars Hofhansl commented on HBASE-5545:
--

@Ram: Isn't this a pretty remote corner case?
If we feel strongly that this should go into 0.94.0, let's get it in.


> region can't be opened for a long time. Because the creating File failed.
> -
>
> Key: HBASE-5545
> URL: https://issues.apache.org/jira/browse/HBASE-5545
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 0.90.6
>Reporter: gaojinchao
>Assignee: gaojinchao
> Fix For: 0.90.7, 0.92.2, 0.94.0
>
> Attachments: HBASE-5545.patch
>
>
> Scenario:
> 
> 1. File is created 
> 2. But while writing data, all datanodes might have crashed. So writing data 
> will fail.
> 3. Now even if close is called in the finally block, close will also fail and 
> throw an exception because writing the data failed.
> 4. After this, if the RS tries to create the same file again, an 
> AlreadyBeingCreatedException will be thrown.
> Suggestion to handle this scenario.
> ---
> 1. Check for the existence of the file; if it exists, delete the file and 
> create a new file. 
> Here the delete call for the file will not check whether the file is open or 
> closed.
> Overwrite Option:
> 
> 1. The overwrite option is applicable if you are trying to overwrite a 
> closed file.
> 2. If the file is not closed, then even with the overwrite option the same 
> AlreadyBeingCreatedException will be thrown.
> This is the expected behaviour, to avoid multiple clients writing to the same 
> file.
> Region server logs:
> org.apache.hadoop.ipc.RemoteException: 
> org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to 
> create file /hbase/test1/12c01902324218d14b17a5880f24f64b/.tmp/.regioninfo 
> for 
> DFSClient_hb_rs_158-1-131-48,20020,1331107668635_1331107669061_-252463556_25 
> on client 158.1.132.19 because current leaseholder is trying to recreate file.
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:1570)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1440)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:1382)
> at org.apache.hadoop.hdfs.server.namenode.NameNode.create(NameNode.java:658)
> at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:547)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1137)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1133)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1131)
> at org.apache.hadoop.ipc.Client.call(Client.java:961)
> at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:245)
> at $Proxy6.create(Unknown Source)
> at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at $Proxy6.create(Unknown Source)
> at 
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.<init>(DFSClient.java:3643)
> at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:778)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:364)
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:630)
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:611)
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:518)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.checkRegioninfoOnFilesystem(HRegion.java:424)
> at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:340)
> at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2672)
> at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2658)
> at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:330)
> at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:116)
> at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:158)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> [2012-03-07 20:51:45,858] [WARN ] 
> [RS_OPEN_REGION-158-1-131-48,200

[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487

2012-04-17 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256094#comment-13256094
 ] 

Lars Hofhansl commented on HBASE-5782:
--

I'm +1 on committing my patch for 0.94.0. We can then either revisit for 
0.94.1+ or 0.96.

> Edits can be appended out of seqid order since HBASE-4487
> -
>
> Key: HBASE-5782
> URL: https://issues.apache.org/jira/browse/HBASE-5782
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 0.94.0
>Reporter: Gopinathan A
>Assignee: ramkrishna.s.vasudevan
>Priority: Blocker
> Fix For: 0.94.0
>
> Attachments: 5782-lars-v2.txt, 5782-sketch.txt, 5782.txt, 
> 5782.unfinished-stack.txt, HBASE-5782.patch, hbase-5782.txt
>
>
> Create a table with 1000 splits; after the region assignment, kill the 
> regionserver which contains the META table.
> Here a few regions are missing after the log splitting and region assignment. 
> The HBCK report shows that multiple region holes got created.
> The same scenario was verified multiple times in 0.92.1 with no issues.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5741) ImportTsv does not check for table existence

2012-04-17 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256083#comment-13256083
 ] 

Lars Hofhansl commented on HBASE-5741:
--

Thanks Clint. No harm, I just think it will hide complexity that should not be 
hidden. I also don't think it's too much to ask a user to create the table 
first (but we should definitely fix the Javadoc in that case).

That all said, I am +-0 on this. It's a simple change, and v3 looks good.
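For what it's worth, the fail-fast variant is tiny either way; a sketch (the 
error text is illustrative, and this is not the attached v3 patch):

{code}
// Illustration only: check the target table before submitting the job instead
// of letting the meta scan inside new HTable(...) blow up later.
HBaseAdmin admin = new HBaseAdmin(conf);
if (!admin.tableExists(tableName)) {
  throw new TableNotFoundException("Target table " + tableName
      + " does not exist; create it before running importtsv.");
}
{code}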

> ImportTsv does not check for table existence 
> -
>
> Key: HBASE-5741
> URL: https://issues.apache.org/jira/browse/HBASE-5741
> Project: HBase
>  Issue Type: Bug
>  Components: mapreduce
>Affects Versions: 0.90.4
>Reporter: Clint Heath
>Assignee: Himanshu Vashishtha
> Fix For: 0.96.0, 0.94.1
>
> Attachments: 5741-94.txt, 5741-v3.txt, HBase-5741-v2.patch, 
> HBase-5741.patch
>
>
> The usage statement for the "importtsv" command to HBase claims this:
> "Note: if you do not use this option, then the target table must already 
> exist in HBase" (in reference to the "importtsv.bulk.output" command-line 
> option)
> The truth is, the table must exist no matter what; importtsv cannot and will 
> not create it for you.
> This is the case because the createSubmittableJob method of ImportTsv does 
> not even attempt to check if the table exists already, much less create it:
> (From org.apache.hadoop.hbase.mapreduce.ImportTsv.java)
> 305 HTable table = new HTable(conf, tableName);
> The HTable method signature in use there assumes the table exists and runs a 
> meta scan on it:
> (From org.apache.hadoop.hbase.client.HTable.java)
> 142 * Creates an object to access a HBase table.
> ...
> 151 public HTable(Configuration conf, final String tableName)
> What we should do inside of createSubmittableJob is something similar to what 
> the "completebulkloads" command would do:
> (Taken from org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.java)
> 690 boolean tableExists = this.doesTableExist(tableName);
> 691 if (!tableExists) this.createTable(tableName,dirPath);
> Currently the docs are misleading; the table in fact must exist prior to 
> running importtsv. We should check whether it exists rather than assume it's 
> already there and throw the below exception:
> 12/03/14 17:15:42 WARN client.HConnectionManager$HConnectionImplementation: 
> Encountered problems when prefetch META table: 
> org.apache.hadoop.hbase.TableNotFoundException: Cannot find row in .META. for 
> table: myTable2, row=myTable2,,99
>   at 
> org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:150)
> ...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487

2012-04-17 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256072#comment-13256072
 ] 

Lars Hofhansl commented on HBASE-5782:
--

Did some more tests with HLogPerformanceEvaluation (and my patch):

10 threads, 10 iterations: with patch: 42s, without patch: 41s
5 threads, 20 iterations:  with patch: 46s, without patch: 44s
2 threads, 20 iterations:  with patch: 46s, without patch: 44s

So for fewer threads my patch is slightly slower.

So... What do we do? :)

> Edits can be appended out of seqid order since HBASE-4487
> -
>
> Key: HBASE-5782
> URL: https://issues.apache.org/jira/browse/HBASE-5782
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 0.94.0
>Reporter: Gopinathan A
>Assignee: ramkrishna.s.vasudevan
>Priority: Blocker
> Fix For: 0.94.0
>
> Attachments: 5782-lars-v2.txt, 5782-sketch.txt, 5782.txt, 
> 5782.unfinished-stack.txt, HBASE-5782.patch, hbase-5782.txt
>
>
> Create a table with 1000 splits; after the region assignment, kill the 
> regionserver which contains the META table.
> Here a few regions are missing after the log splitting and region assignment. 
> The HBCK report shows that multiple region holes got created.
> The same scenario was verified multiple times in 0.92.1 with no issues.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487

2012-04-17 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256058#comment-13256058
 ] 

Lars Hofhansl commented on HBASE-5782:
--

Thanks Stack!

I'll run a test with fewer threads too, just to make sure. The fact that both 
of our patches are faster probably means that we did a lot of unnecessary 
sync'ing before...? A 20% performance increase is pretty damn awesome.

> Edits can be appended out of seqid order since HBASE-4487
> -
>
> Key: HBASE-5782
> URL: https://issues.apache.org/jira/browse/HBASE-5782
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 0.94.0
>Reporter: Gopinathan A
>Assignee: ramkrishna.s.vasudevan
>Priority: Blocker
> Fix For: 0.94.0
>
> Attachments: 5782-lars-v2.txt, 5782-sketch.txt, 5782.txt, 
> 5782.unfinished-stack.txt, HBASE-5782.patch, hbase-5782.txt
>
>
> Create a table with 1000 splits; after the region assignment, kill the 
> regionserver which contains the META table.
> Here a few regions are missing after the log splitting and region assignment. 
> The HBCK report shows that multiple region holes got created.
> The same scenario was verified multiple times in 0.92.1 with no issues.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487

2012-04-17 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13255974#comment-13255974
 ] 

Lars Hofhansl commented on HBASE-5782:
--

@Stack: I ran it too. It works fine. The "interesting" thing I find is that it 
is faster *with* the patch! And that scares me.
I ran with 100 threads, 1 iterations. With the patch it took 29s, without 
it took 43s. This is on a 6-core machine with hyperthreading.

Now, HLogPerformanceEvaluation does not work with -nosync and -verify, 
right? (presumably because no final sync is issued).

Wouldn't mind getting some other numbers from you as well if you had some time.

> Edits can be appended out of seqid order since HBASE-4487
> -
>
> Key: HBASE-5782
> URL: https://issues.apache.org/jira/browse/HBASE-5782
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 0.94.0
>Reporter: Gopinathan A
>Assignee: ramkrishna.s.vasudevan
>Priority: Blocker
> Fix For: 0.94.0
>
> Attachments: 5782-lars-v2.txt, 5782-sketch.txt, 5782.txt, 
> 5782.unfinished-stack.txt, HBASE-5782.patch, hbase-5782.txt
>
>
> Create a table with 1000 splits; after the region assignment, kill the 
> regionserver which contains the META table.
> Here a few regions are missing after the log splitting and region assignment. 
> The HBCK report shows that multiple region holes got created.
> The same scenario was verified multiple times in 0.92.1 with no issues.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487

2012-04-17 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13255939#comment-13255939
 ] 

Lars Hofhansl commented on HBASE-5782:
--

Ah, I see. Yes, the doneUpto line should be pulled into the synchronized block.

> Edits can be appended out of seqid order since HBASE-4487
> -
>
> Key: HBASE-5782
> URL: https://issues.apache.org/jira/browse/HBASE-5782
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 0.94.0
>Reporter: Gopinathan A
>Assignee: ramkrishna.s.vasudevan
>Priority: Blocker
> Fix For: 0.94.0
>
> Attachments: 5782-lars-v2.txt, 5782-sketch.txt, 5782.txt, 
> 5782.unfinished-stack.txt, HBASE-5782.patch, hbase-5782.txt
>
>
> Create a table with 1000 splits; after the region assignment, kill the 
> regionserver which contains the META table.
> Here a few regions are missing after the log splitting and region assignment. 
> The HBCK report shows that multiple region holes got created.
> The same scenario was verified multiple times in 0.92.1 with no issues.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487

2012-04-17 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13255911#comment-13255911
 ] 

Lars Hofhansl commented on HBASE-5782:
--

@Stack: the doneUpTo is fine, because it is only used to set syncedTillHere. We 
won't write more into the log (once we take the pendingWrites they are gone), 
but we might sync too much unnecessarily.
Will do the (performance) testing now. I don't have a cluster at my disposal 
atm, so I'll do it with a local HDFS instance.

If we'd rather finish Todd's for 0.94 that'd be nice.
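To make the ordering argument concrete, a stripped-down sketch of the pattern 
being discussed (names like updateLock, takePendingWrites() and 
latestAssignedSeqId() are simplified stand-ins, not the actual HLog code):

{code}
// Simplified group-commit sketch: take the pending writes and the highest
// pending sequence id atomically, write and sync outside the lock, and only
// then advance syncedTillHere. Updating syncedTillHere last is what keeps a
// caller from believing an edit is durable before its sync really happened.
List<Entry> pending;
long doneUpto;
synchronized (updateLock) {
  pending = takePendingWrites();     // pendingWrites were appended in order
  doneUpto = latestAssignedSeqId();  // read under the same lock (see above)
}
for (Entry e : pending) {
  writer.append(e);                  // file order now matches seqid order
}
writer.sync();
syncedTillHere = Math.max(syncedTillHere, doneUpto);  // advance last
{code}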


> Edits can be appended out of seqid order since HBASE-4487
> -
>
> Key: HBASE-5782
> URL: https://issues.apache.org/jira/browse/HBASE-5782
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 0.94.0
>Reporter: Gopinathan A
>Assignee: ramkrishna.s.vasudevan
>Priority: Blocker
> Fix For: 0.94.0
>
> Attachments: 5782-lars-v2.txt, 5782-sketch.txt, 5782.txt, 
> 5782.unfinished-stack.txt, HBASE-5782.patch, hbase-5782.txt
>
>
> Create a table with 1000 splits; after the region assignment, kill the 
> regionserver which contains the META table.
> Here a few regions are missing after the log splitting and region assignment. 
> The HBCK report shows that multiple region holes got created.
> The same scenario was verified multiple times in 0.92.1 with no issues.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487

2012-04-17 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13255828#comment-13255828
 ] 

Lars Hofhansl commented on HBASE-5782:
--

Anybody opposed to taking my patch for 0.94 and doing a rewrite in trunk? Todd? 
Stack? Ted?


> Edits can be appended out of seqid order since HBASE-4487
> -
>
> Key: HBASE-5782
> URL: https://issues.apache.org/jira/browse/HBASE-5782
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 0.94.0
>Reporter: Gopinathan A
>Assignee: ramkrishna.s.vasudevan
>Priority: Blocker
> Fix For: 0.94.0
>
> Attachments: 5782-lars-v2.txt, 5782-sketch.txt, 5782.txt, 
> 5782.unfinished-stack.txt, HBASE-5782.patch, hbase-5782.txt
>
>
> Create a table with 1000 splits; after the region assignment, kill the 
> regionserver which contains the META table.
> Here a few regions are missing after the log splitting and region assignment. 
> The HBCK report shows that multiple region holes got created.
> The same scenario was verified multiple times in 0.92.1 with no issues.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.

2012-04-17 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13255779#comment-13255779
 ] 

Lars Hofhansl commented on HBASE-5545:
--

+1 on the patch. I assume there are no strange race conditions in this part of the code.

> region can't be opened for a long time. Because the creating File failed.
> -
>
> Key: HBASE-5545
> URL: https://issues.apache.org/jira/browse/HBASE-5545
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 0.90.6
>Reporter: gaojinchao
>Assignee: gaojinchao
> Fix For: 0.90.7, 0.92.2, 0.94.0
>
> Attachments: HBASE-5545.patch
>
>
> Scenario:
> 
> 1. File is created 
> 2. But while writing data, all datanodes might have crashed. So writing data 
> will fail.
> 3. Now even if close is called in the finally block, close will also fail and 
> throw an exception because writing the data failed.
> 4. After this, if the RS tries to create the same file again, an 
> AlreadyBeingCreatedException will be thrown.
> Suggestion to handle this scenario.
> ---
> 1. Check for the existence of the file; if it exists, delete the file and 
> create a new file. 
> Here the delete call for the file will not check whether the file is open or 
> closed.
> Overwrite Option:
> 
> 1. The overwrite option is applicable if you are trying to overwrite a 
> closed file.
> 2. If the file is not closed, then even with the overwrite option the same 
> AlreadyBeingCreatedException will be thrown.
> This is the expected behaviour, to avoid multiple clients writing to the same 
> file.
> Region server logs:
> org.apache.hadoop.ipc.RemoteException: 
> org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to 
> create file /hbase/test1/12c01902324218d14b17a5880f24f64b/.tmp/.regioninfo 
> for 
> DFSClient_hb_rs_158-1-131-48,20020,1331107668635_1331107669061_-252463556_25 
> on client 158.1.132.19 because current leaseholder is trying to recreate file.
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:1570)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1440)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:1382)
> at org.apache.hadoop.hdfs.server.namenode.NameNode.create(NameNode.java:658)
> at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:547)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1137)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1133)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1131)
> at org.apache.hadoop.ipc.Client.call(Client.java:961)
> at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:245)
> at $Proxy6.create(Unknown Source)
> at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at $Proxy6.create(Unknown Source)
> at 
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.<init>(DFSClient.java:3643)
> at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:778)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:364)
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:630)
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:611)
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:518)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.checkRegioninfoOnFilesystem(HRegion.java:424)
> at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:340)
> at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2672)
> at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2658)
> at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:330)
> at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:116)
> at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:158)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> [2012-03-07 20:51:45,858] [WARN ] 
> [RS_OPEN_REGION-158-1-131-48,20020,1331107668635-23] 
> [com.huawei.

[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487

2012-04-17 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13255770#comment-13255770
 ] 

Lars Hofhansl commented on HBASE-5782:
--

Yep. It's crucial that syncTillHere is updated last.
The pendingWrites are appended strictly in order, so there is a very short race 
that might lead to sync being issued multiple times when only one was necessary 
(it seems the same race condition existed before).
So I think it is safe and low risk.
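
To make the ordering concrete, here is a minimal sketch (not the actual HLog code; SimpleWal and flushToDisk are made-up names) of why advancing syncTillHere only after the flush completes keeps this safe:

{code}
// Minimal sketch under the assumptions above; not the real HLog implementation.
import java.util.ArrayList;
import java.util.List;

public class SimpleWal {
  private final List<byte[]> pendingWrites = new ArrayList<byte[]>();
  private long txid = 0;                   // id of the last appended edit
  private volatile long syncTillHere = 0;  // id of the last durably synced edit

  /** Appends are serialized, so pendingWrites stays strictly in txid order. */
  public synchronized long append(byte[] edit) {
    pendingWrites.add(edit);
    return ++txid;
  }

  /** Blocks until everything up to 'upTo' has been flushed to the filesystem. */
  public void sync(long upTo) {
    if (syncTillHere >= upTo) {
      return;  // already synced; the short race can only cause a redundant sync
    }
    List<byte[]> batch;
    long batchTxid;
    synchronized (this) {
      batch = new ArrayList<byte[]>(pendingWrites);
      pendingWrites.clear();
      batchTxid = txid;
    }
    flushToDisk(batch);  // the slow part; two racing callers may both get here
    // Crucial bit: syncTillHere is advanced only AFTER the flush has completed,
    // so no caller ever observes an edit as synced before it really is.
    synchronized (this) {
      if (batchTxid > syncTillHere) {
        syncTillHere = batchTxid;
      }
    }
  }

  private void flushToDisk(List<byte[]> batch) {
    // write the entries and hflush the underlying stream (elided)
  }
}
{code}

The worst the race can do in this shape is an extra, unnecessary flush; it can never report an edit as synced too early.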

The question now is: Do this for 0.94 and then a more elaborate rewrite in 
trunk, or do a more risky rewrite for 0.94?


> Edits can be appended out of seqid order since HBASE-4487
> -
>
> Key: HBASE-5782
> URL: https://issues.apache.org/jira/browse/HBASE-5782
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 0.94.0
>Reporter: Gopinathan A
>Assignee: ramkrishna.s.vasudevan
>Priority: Blocker
> Fix For: 0.94.0
>
> Attachments: 5782-lars-v2.txt, 5782-sketch.txt, 5782.txt, 
> 5782.unfinished-stack.txt, HBASE-5782.patch, hbase-5782.txt
>
>
> Create a table with 1000 splits; after the region assignment, kill the 
> regionserver which contains the META table.
> Here a few regions are missing after the log splitting and region assignment. 
> The HBCK report shows that multiple region holes got created.
> The same scenario was verified multiple times in 0.92.1 with no issues.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5545) region can't be opened for a long time. Because the creating File failed.

2012-04-17 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13255714#comment-13255714
 ] 

Lars Hofhansl commented on HBASE-5545:
--

If this gets done before HBASE-5782 I'll pull it in; otherwise I don't think I 
should hold up the release for this. Sounds good?


> region can't be opened for a long time. Because the creating File failed.
> -
>
> Key: HBASE-5545
> URL: https://issues.apache.org/jira/browse/HBASE-5545
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 0.90.6
>Reporter: gaojinchao
>Assignee: gaojinchao
> Fix For: 0.90.7, 0.92.2, 0.94.1
>
>
> Scenario:
> 
> 1. File is created 
> 2. But while writing data, all datanodes might have crashed, so writing data 
> will fail.
> 3. Now even if close is called in the finally block, close will also fail and 
> throw an Exception because writing data failed.
> 4. After this, if the RS tries to create the same file again, an 
> AlreadyBeingCreatedException will be thrown.
> Suggestion to handle this scenario:
> ---
> 1. Check for the existence of the file; if it exists, delete the file and 
> create a new file. 
> Here the delete call for the file will not check whether the file is open or 
> closed.
> Overwrite Option:
> 
> 1. The overwrite option is applicable only if you are trying to overwrite a 
> closed file.
> 2. If the file is not closed, then even with the overwrite option the same 
> AlreadyBeingCreatedException will be thrown.
> This is the expected behaviour, to avoid multiple clients writing to the same 
> file.
> Region server logs:
> org.apache.hadoop.ipc.RemoteException: 
> org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to 
> create file /hbase/test1/12c01902324218d14b17a5880f24f64b/.tmp/.regioninfo 
> for 
> DFSClient_hb_rs_158-1-131-48,20020,1331107668635_1331107669061_-252463556_25 
> on client 158.1.132.19 because current leaseholder is trying to recreate file.
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:1570)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1440)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:1382)
> at org.apache.hadoop.hdfs.server.namenode.NameNode.create(NameNode.java:658)
> at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:547)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1137)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1133)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1131)
> at org.apache.hadoop.ipc.Client.call(Client.java:961)
> at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:245)
> at $Proxy6.create(Unknown Source)
> at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at $Proxy6.create(Unknown Source)
> at 
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.(DFSClient.java:3643)
> at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:778)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:364)
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:630)
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:611)
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:518)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion.checkRegioninfoOnFilesystem(HRegion.java:424)
> at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:340)
> at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2672)
> at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:2658)
> at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:330)
> at 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:116)
> at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:158)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> [2012-03-07 20:51:45,858] [WARN ] 
> [RS_OPEN_REGION-158-1-131-48,20020,1331107668635-23] 
> [com

[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487

2012-04-16 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13255308#comment-13255308
 ] 

Lars Hofhansl commented on HBASE-5782:
--

@Ram: It looks like your patch should work for this scenario, but I'd be 
generally wary about edits not being in order in the WAL.

> Edits can be appended out of seqid order since HBASE-4487
> -
>
> Key: HBASE-5782
> URL: https://issues.apache.org/jira/browse/HBASE-5782
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 0.94.0
>Reporter: Gopinathan A
>Assignee: ramkrishna.s.vasudevan
>Priority: Blocker
> Fix For: 0.94.0
>
> Attachments: 5782-sketch.txt, 5782.txt, 5782.unfinished-stack.txt, 
> HBASE-5782.patch
>
>
> Create a table with 1000 splits; after the region assignment, kill the 
> regionserver which contains the META table.
> Here a few regions are missing after the log splitting and region assignment. 
> The HBCK report shows that multiple region holes got created.
> The same scenario was verified multiple times in 0.92.1 with no issues.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5778) Turn on WAL compression by default

2012-04-16 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13255230#comment-13255230
 ] 

Lars Hofhansl commented on HBASE-5778:
--

Hmm... Then that does not explain what I saw. I saw the ReplicationSource 
trying to read from a position in the file (indicated by ZK) and then the read 
failing because the dictionary was not built up.

> Turn on WAL compression by default
> --
>
> Key: HBASE-5778
> URL: https://issues.apache.org/jira/browse/HBASE-5778
> Project: HBase
>  Issue Type: Improvement
>Reporter: Jean-Daniel Cryans
>Assignee: Lars Hofhansl
>Priority: Blocker
> Fix For: 0.96.0, 0.94.1
>
> Attachments: 5778-addendum.txt, 5778.addendum, HBASE-5778.patch
>
>
> I ran some tests to verify if WAL compression should be turned on by default.
> For a use case where it's not very useful (values two orders of magnitude 
> bigger than the keys), the insert time wasn't different and the CPU usage was 
> 15% higher (150% CPU usage vs. 130% when not compressing the WAL).
> When values are smaller than the keys, I saw a 38% improvement for the insert 
> run time and CPU usage was 33% higher (600% CPU usage VS 450%). I'm not sure 
> WAL compression accounts for all the additional CPU usage, it might just be 
> that we're able to insert faster and we spend more time in the MemStore per 
> second (because our MemStores are bad when they contain tens of thousands of 
> values).
> Those are two extremes, but it shows that for the price of some CPU we can 
> save a lot. My machines have 2 quads with HT, so I still had a lot of idle 
> CPUs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5778) Turn on WAL compression by default

2012-04-16 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13255214#comment-13255214
 ] 

Lars Hofhansl commented on HBASE-5778:
--

If we could tail the logs it would work. We just cannot seek into an HLog in 
the middle and start reading from it.

> Turn on WAL compression by default
> --
>
> Key: HBASE-5778
> URL: https://issues.apache.org/jira/browse/HBASE-5778
> Project: HBase
>  Issue Type: Improvement
>Reporter: Jean-Daniel Cryans
>Assignee: Lars Hofhansl
>Priority: Blocker
> Fix For: 0.96.0, 0.94.1
>
> Attachments: 5778-addendum.txt, 5778.addendum, HBASE-5778.patch
>
>
> I ran some tests to verify if WAL compression should be turned on by default.
> For a use case where it's not very useful (values two orders of magnitude 
> bigger than the keys), the insert time wasn't different and the CPU usage was 
> 15% higher (150% CPU usage vs. 130% when not compressing the WAL).
> When values are smaller than the keys, I saw a 38% improvement for the insert 
> run time and CPU usage was 33% higher (600% CPU usage VS 450%). I'm not sure 
> WAL compression accounts for all the additional CPU usage, it might just be 
> that we're able to insert faster and we spend more time in the MemStore per 
> second (because our MemStores are bad when they contain tens of thousands of 
> values).
> Those are two extremes, but it shows that for the price of some CPU we can 
> save a lot. My machines have 2 quads with HT, so I still had a lot of idle 
> CPUs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487

2012-04-16 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13255191#comment-13255191
 ] 

Lars Hofhansl commented on HBASE-5782:
--

Sketched patch looks good: it makes sure that we don't sync a batch of writes 
before the previous batch is done.

I was thinking about something similar, but still allowing multiple threads to 
write, just that a thread cannot start writing before the previous batch is 
confirmed sync'ed. I guess we actually wouldn't get more parallelism out of that 
than with your approach.
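
For what it's worth, here is a rough sketch of that alternative (purely illustrative, hypothetical names): several threads may prepare batches, but batch N is not written until batch N-1 is confirmed synced, which preserves seqid order but, as said, ends up serializing write+sync anyway:

{code}
// Illustrative sketch only; OrderedBatchWriter is not an existing class.
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

public class OrderedBatchWriter {
  private final ReentrantLock lock = new ReentrantLock();
  private final Condition turn = lock.newCondition();
  private long nextBatchToIssue = 0;   // ticket counter handed to writers
  private long lastSyncedBatch = -1;   // highest batch confirmed synced

  public void writeAndSync(Runnable writeBatch, Runnable syncBatch)
      throws InterruptedException {
    long myBatch;
    lock.lock();
    try {
      myBatch = nextBatchToIssue++;
      // Wait until the previous batch has been confirmed synced.
      while (lastSyncedBatch < myBatch - 1) {
        turn.await();
      }
    } finally {
      lock.unlock();
    }
    writeBatch.run();   // append this batch's edits, in order
    syncBatch.run();    // hflush/sync the underlying stream
    lock.lock();
    try {
      lastSyncedBatch = myBatch;
      turn.signalAll(); // let the next batch proceed
    } finally {
      lock.unlock();
    }
  }
}
{code}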


> Edits can be appended out of seqid order since HBASE-4487
> -
>
> Key: HBASE-5782
> URL: https://issues.apache.org/jira/browse/HBASE-5782
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 0.94.0
>Reporter: Gopinathan A
>Assignee: ramkrishna.s.vasudevan
>Priority: Blocker
> Fix For: 0.94.0
>
> Attachments: 5782-sketch.txt, 5782.txt, HBASE-5782.patch
>
>
> Create a table with 1000 splits; after the region assignment, kill the 
> regionserver which contains the META table.
> Here a few regions are missing after the log splitting and region assignment. 
> The HBCK report shows that multiple region holes got created.
> The same scenario was verified multiple times in 0.92.1 with no issues.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5780) Fix race in HBase regionserver startup vs ZK SASL authentication

2012-04-16 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13255170#comment-13255170
 ] 

Lars Hofhansl commented on HBASE-5780:
--

testResetZooKeeperSession failed in 0.94 again. :(

> Fix race in HBase regionserver startup vs ZK SASL authentication
> 
>
> Key: HBASE-5780
> URL: https://issues.apache.org/jira/browse/HBASE-5780
> Project: HBase
>  Issue Type: Bug
>  Components: security
>Affects Versions: 0.92.1, 0.94.0
>Reporter: Shaneal Manek
>Assignee: Shaneal Manek
> Fix For: 0.92.2, 0.94.0, 0.96.0
>
> Attachments: HBASE-5780-v2.patch, HBASE-5780.patch, 
> TestReplicationPeer-Security-output.log, TestReplicationPeer-output.log, 
> testoutput.tar.gz
>
>
> Secure RegionServers sometimes fail to start with the following backtrace:
> 2012-03-22 17:20:16,737 FATAL 
> org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server 
> centos60-20.ent.cloudera.com,60020,1332462015929: Unexpected exception during 
> initialization, aborting
> org.apache.zookeeper.KeeperException$NoAuthException: KeeperErrorCode = 
> NoAuth for /hbase/shutdown
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:113)
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1131)
> at 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:295)
> at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataInternal(ZKUtil.java:518)
> at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:494)
> at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.start(ZooKeeperNodeTracker.java:77)
> at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.initializeZooKeeper(HRegionServer.java:569)
> at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:532)
> at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:634)
> at java.lang.Thread.run(Thread.java:662)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487

2012-04-16 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13255168#comment-13255168
 ] 

Lars Hofhansl commented on HBASE-5782:
--

Is this really only a problem because of HBASE-4528 and HBASE-4487?
It seems even without HLog.appendNoSync() it is possible that one thread 
flushes an entire batch of pending writes before a thread that started earlier 
can get to it.

> Edits can be appended out of seqid order since HBASE-4487
> -
>
> Key: HBASE-5782
> URL: https://issues.apache.org/jira/browse/HBASE-5782
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 0.94.0
>Reporter: Gopinathan A
>Assignee: ramkrishna.s.vasudevan
>Priority: Blocker
> Fix For: 0.94.0
>
> Attachments: 5782.txt, HBASE-5782.patch
>
>
> Create a table with 1000 splits; after the region assignment, kill the 
> regionserver which contains the META table.
> Here a few regions are missing after the log splitting and region assignment. 
> The HBCK report shows that multiple region holes got created.
> The same scenario was verified multiple times in 0.92.1 with no issues.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5782) Not all the regions are getting assigned after the log splitting.

2012-04-16 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13255094#comment-13255094
 ] 

Lars Hofhansl commented on HBASE-5782:
--

The short term choices we have are:
# revert HBASE-4528, HBASE-4487, and HBASE-5541 (are there others?)
# Partially revert HBASE-2467 (or devise other ways to have strictly one thread 
flushing an HLog).

Maybe Todd as the author of HBASE-2467 could chime in... Todd?

> Not all the regions are getting assigned after the log splitting.
> -
>
> Key: HBASE-5782
> URL: https://issues.apache.org/jira/browse/HBASE-5782
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 0.94.0
>Reporter: Gopinathan A
>Assignee: ramkrishna.s.vasudevan
>Priority: Blocker
> Fix For: 0.94.0
>
> Attachments: HBASE-5782.patch
>
>
> Create a table with 1000 splits; after the region assignment, kill the 
> regionserver which contains the META table.
> Here a few regions are missing after the log splitting and region assignment. 
> The HBCK report shows that multiple region holes got created.
> The same scenario was verified multiple times in 0.92.1 with no issues.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5782) Not all the regions are getting assigned after the log splitting.

2012-04-16 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13254925#comment-13254925
 ] 

Lars Hofhansl commented on HBASE-5782:
--

So the problem is that the logSyncerThread keeps the edits in order but the 
syncer then applies the pending batches potentially out of order?

We might just need a "sync lock" to prevent two threads syncing at the same 
time. This is different from the update lock, which also prevents writing any 
edits.
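
Something like the following (hypothetical names, just to illustrate the idea): appends keep taking the existing update lock, while a separate sync lock guarantees that only one thread at a time drains and flushes pending batches:

{code}
// Sketch of the "sync lock" idea; LogSyncGate is an invented name, not real code.
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class LogSyncGate {
  private final Object updateLock = new Object(); // existing lock: guards appends
  private final Object syncLock = new Object();   // new lock: one syncer at a time
  private final List<byte[]> pendingWrites = new ArrayList<byte[]>();

  public void append(byte[] edit) {
    synchronized (updateLock) {
      pendingWrites.add(edit); // edits stay in seqid order
    }
  }

  public void sync() throws IOException {
    synchronized (syncLock) {           // only one thread syncs at a time
      List<byte[]> batch;
      synchronized (updateLock) {       // briefly block appends to grab the batch
        batch = new ArrayList<byte[]>(pendingWrites);
        pendingWrites.clear();
      }
      writeAndFlush(batch);             // batches reach the stream in order
    }                                   // because syncLock serializes them
  }

  private void writeAndFlush(List<byte[]> batch) throws IOException {
    // append each entry to the output stream and hflush (elided)
  }
}
{code}

Appends can still proceed while a flush is in flight; only the syncers are serialized.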

> Not all the regions are getting assigned after the log splitting.
> -
>
> Key: HBASE-5782
> URL: https://issues.apache.org/jira/browse/HBASE-5782
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 0.94.0
>Reporter: Gopinathan A
>Assignee: ramkrishna.s.vasudevan
>Priority: Blocker
> Fix For: 0.94.0
>
> Attachments: HBASE-5782.patch
>
>
> Create a table with 1000 splits; after the region assignment, kill the 
> regionserver which contains the META table.
> Here a few regions are missing after the log splitting and region assignment. 
> The HBCK report shows that multiple region holes got created.
> The same scenario was verified multiple times in 0.92.1 with no issues.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5795) hbase-3927 breaks 0.92<->0.94 compatibility

2012-04-16 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13254864#comment-13254864
 ] 

Lars Hofhansl commented on HBASE-5795:
--

+1 on v2. Are you integrating v2 with Stack's test?

> hbase-3927 breaks 0.92<->0.94 compatibility
> ---
>
> Key: HBASE-5795
> URL: https://issues.apache.org/jira/browse/HBASE-5795
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: stack
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5795-v2.txt, 5795.unittest.txt
>
>
> This commit broke our 0.92/0.94 compatibility:
> {code}
> 
> r1136686 | stack | 2011-06-16 14:18:08 -0700 (Thu, 16 Jun 2011) | 1 line
> HBASE-3927 display total uncompressed byte size of a region in web UI
> {code}
> I just tried the new RC for 0.94.  I brought up a 0.94 master on a 0.92 
> cluster and rather than just digest version 1 of the HServerLoad, I get this:
> {code}
> 2012-04-14 22:47:59,752 WARN org.apache.hadoop.ipc.HBaseServer: Unable to 
> read call parameters for client 10.4.14.38
> java.io.IOException: Error in readFields
> at 
> org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:684)
> at 
> org.apache.hadoop.hbase.ipc.Invocation.readFields(Invocation.java:125)
> at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:1269)
> at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:1184)
> at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:722)
> at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.doRunLoop(HBaseServer.java:513)
> at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:488)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> Caused by: A record version mismatch occured. Expecting v2, found v1
> at 
> org.apache.hadoop.io.VersionedWritable.readFields(VersionedWritable.java:46)
> at 
> org.apache.hadoop.hbase.HServerLoad$RegionLoad.readFields(HServerLoad.java:379)
> at 
> org.apache.hadoop.hbase.HServerLoad.readFields(HServerLoad.java:686)
> at 
> org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:681)
> ... 9 more
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5795) hbase-3927 breaks 0.92<->0.94 compatibility

2012-04-15 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13254510#comment-13254510
 ] 

Lars Hofhansl commented on HBASE-5795:
--

@Ted: I don't understand why that redundant write in 0.94 causes any problem. 
Can you elaborate? Was there another problem in v1?


> hbase-3927 breaks 0.92<->0.94 compatibility
> ---
>
> Key: HBASE-5795
> URL: https://issues.apache.org/jira/browse/HBASE-5795
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: stack
> Fix For: 0.94.0
>
> Attachments: 5795-v2.txt, 5795.unittest.txt
>
>
> This commit broke our 0.92/0.94 compatibility:
> {code}
> 
> r1136686 | stack | 2011-06-16 14:18:08 -0700 (Thu, 16 Jun 2011) | 1 line
> HBASE-3927 display total uncompressed byte size of a region in web UI
> {code}
> I just tried the new RC for 0.94.  I brought up a 0.94 master on a 0.92 
> cluster and rather than just digest version 1 of the HServerLoad, I get this:
> {code}
> 2012-04-14 22:47:59,752 WARN org.apache.hadoop.ipc.HBaseServer: Unable to 
> read call parameters for client 10.4.14.38
> java.io.IOException: Error in readFields
> at 
> org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:684)
> at 
> org.apache.hadoop.hbase.ipc.Invocation.readFields(Invocation.java:125)
> at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:1269)
> at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:1184)
> at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:722)
> at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.doRunLoop(HBaseServer.java:513)
> at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:488)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> Caused by: A record version mismatch occured. Expecting v2, found v1
> at 
> org.apache.hadoop.io.VersionedWritable.readFields(VersionedWritable.java:46)
> at 
> org.apache.hadoop.hbase.HServerLoad$RegionLoad.readFields(HServerLoad.java:379)
> at 
> org.apache.hadoop.hbase.HServerLoad.readFields(HServerLoad.java:686)
> at 
> org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:681)
> ... 9 more
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5796) Fix our abuse of IOE: see http://blog.tsunanet.net/2012/04/apache-hadoop-abuse-ioexception.html

2012-04-15 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13254470#comment-13254470
 ] 

Lars Hofhansl commented on HBASE-5796:
--

Could be as simple as a better hierarchy of classes extending IOException (like 
DoNotRetryException).
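
For illustration (class names here are invented, not proposals for actual names), the kind of narrower IOException subclasses meant, following the DoNotRetry-style pattern:

{code}
// Hypothetical examples only; these classes do not exist in HBase.
import java.io.IOException;

/** Thrown when a request refers to a region this server does not host. */
class RegionNotServedException extends IOException {
  public RegionNotServedException(String msg) {
    super(msg);
  }
}

/** Thrown when a table-level precondition fails and retrying cannot help. */
class TableConstraintException extends IOException {
  public TableConstraintException(String msg, Throwable cause) {
    super(msg, cause);
  }
}
{code}

Callers could then catch the specific subclass where they can actually react, instead of pattern-matching on a generic IOE message.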

> Fix our abuse of IOE: see 
> http://blog.tsunanet.net/2012/04/apache-hadoop-abuse-ioexception.html
> ---
>
> Key: HBASE-5796
> URL: https://issues.apache.org/jira/browse/HBASE-5796
> Project: HBase
>  Issue Type: Task
>Reporter: stack
>
> Lets make more context particular exceptions rather than throw IOEs 
> everywhere.  See Benoît's rant: 
> http://blog.tsunanet.net/2012/04/apache-hadoop-abuse-ioexception.html

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5782) Not all the regions are getting assigned after the log splitting.

2012-04-15 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13254433#comment-13254433
 ] 

Lars Hofhansl commented on HBASE-5782:
--

Thanks Ram!

> Not all the regions are getting assigned after the log splitting.
> -
>
> Key: HBASE-5782
> URL: https://issues.apache.org/jira/browse/HBASE-5782
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 0.94.0
>Reporter: Gopinathan A
>Assignee: ramkrishna.s.vasudevan
>Priority: Critical
> Fix For: 0.94.0
>
>
> Create a table with 1000 splits; after the region assignment, kill the 
> regionserver which contains the META table.
> Here a few regions are missing after the log splitting and region assignment. 
> The HBCK report shows that multiple region holes got created.
> The same scenario was verified multiple times in 0.92.1 with no issues.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5795) hbase-3927 breaks 0.92<->0.94 compatibility

2012-04-15 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13254430#comment-13254430
 ] 

Lars Hofhansl commented on HBASE-5795:
--

Oh, you attached your 5794 patch to this issue. I removed it to avoid confusion; 
when you get a chance, could you attach your patch here? Then we can make a 
combined patch with that and Ted's fix.

> hbase-3927 breaks 0.92<->0.94 compatibility
> ---
>
> Key: HBASE-5795
> URL: https://issues.apache.org/jira/browse/HBASE-5795
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: stack
> Fix For: 0.94.0
>
> Attachments: 5795-v1.txt
>
>
> This commit broke our 0.92/0.94 compatibility:
> {code}
> 
> r1136686 | stack | 2011-06-16 14:18:08 -0700 (Thu, 16 Jun 2011) | 1 line
> HBASE-3927 display total uncompressed byte size of a region in web UI
> {code}
> I just tried the new RC for 0.94.  I brought up a 0.94 master on a 0.92 
> cluster and rather than just digest version 1 of the HServerLoad, I get this:
> {code}
> 2012-04-14 22:47:59,752 WARN org.apache.hadoop.ipc.HBaseServer: Unable to 
> read call parameters for client 10.4.14.38
> java.io.IOException: Error in readFields
> at 
> org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:684)
> at 
> org.apache.hadoop.hbase.ipc.Invocation.readFields(Invocation.java:125)
> at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:1269)
> at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:1184)
> at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:722)
> at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.doRunLoop(HBaseServer.java:513)
> at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:488)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> Caused by: A record version mismatch occured. Expecting v2, found v1
> at 
> org.apache.hadoop.io.VersionedWritable.readFields(VersionedWritable.java:46)
> at 
> org.apache.hadoop.hbase.HServerLoad$RegionLoad.readFields(HServerLoad.java:379)
> at 
> org.apache.hadoop.hbase.HServerLoad.readFields(HServerLoad.java:686)
> at 
> org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:681)
> ... 9 more
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5795) hbase-3927 breaks 0.92<->0.94 compatibility

2012-04-15 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13254429#comment-13254429
 ] 

Lars Hofhansl commented on HBASE-5795:
--

@Stack: I think you forgot to include the actual test in the patch :)

> hbase-3927 breaks 0.92<->0.94 compatibility
> ---
>
> Key: HBASE-5795
> URL: https://issues.apache.org/jira/browse/HBASE-5795
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: stack
> Fix For: 0.94.0
>
> Attachments: 5795-v1.txt
>
>
> This commit broke our 0.92/0.94 compatibility:
> {code}
> 
> r1136686 | stack | 2011-06-16 14:18:08 -0700 (Thu, 16 Jun 2011) | 1 line
> HBASE-3927 display total uncompressed byte size of a region in web UI
> {code}
> I just tried the new RC for 0.94.  I brought up a 0.94 master on a 0.92 
> cluster and rather than just digest version 1 of the HServerLoad, I get this:
> {code}
> 2012-04-14 22:47:59,752 WARN org.apache.hadoop.ipc.HBaseServer: Unable to 
> read call parameters for client 10.4.14.38
> java.io.IOException: Error in readFields
> at 
> org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:684)
> at 
> org.apache.hadoop.hbase.ipc.Invocation.readFields(Invocation.java:125)
> at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:1269)
> at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:1184)
> at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:722)
> at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.doRunLoop(HBaseServer.java:513)
> at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:488)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> Caused by: A record version mismatch occured. Expecting v2, found v1
> at 
> org.apache.hadoop.io.VersionedWritable.readFields(VersionedWritable.java:46)
> at 
> org.apache.hadoop.hbase.HServerLoad$RegionLoad.readFields(HServerLoad.java:379)
> at 
> org.apache.hadoop.hbase.HServerLoad.readFields(HServerLoad.java:686)
> at 
> org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:681)
> ... 9 more
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5795) hbase-3927 breaks 0.92<->0.94 compatibility

2012-04-15 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13254414#comment-13254414
 ] 

Lars Hofhansl commented on HBASE-5795:
--

+1 on patch.

> hbase-3927 breaks 0.92<->0.94 compatibility
> ---
>
> Key: HBASE-5795
> URL: https://issues.apache.org/jira/browse/HBASE-5795
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: stack
> Fix For: 0.94.0
>
> Attachments: 5794.txt, 5795-v1.txt
>
>
> This commit broke our 0.92/0.94 compatibility:
> {code}
> 
> r1136686 | stack | 2011-06-16 14:18:08 -0700 (Thu, 16 Jun 2011) | 1 line
> HBASE-3927 display total uncompressed byte size of a region in web UI
> {code}
> I just tried the new RC for 0.94.  I brought up a 0.94 master on a 0.92 
> cluster and rather than just digest version 1 of the HServerLoad, I get this:
> {code}
> 2012-04-14 22:47:59,752 WARN org.apache.hadoop.ipc.HBaseServer: Unable to 
> read call parameters for client 10.4.14.38
> java.io.IOException: Error in readFields
> at 
> org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:684)
> at 
> org.apache.hadoop.hbase.ipc.Invocation.readFields(Invocation.java:125)
> at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:1269)
> at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:1184)
> at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:722)
> at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.doRunLoop(HBaseServer.java:513)
> at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:488)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> Caused by: A record version mismatch occured. Expecting v2, found v1
> at 
> org.apache.hadoop.io.VersionedWritable.readFields(VersionedWritable.java:46)
> at 
> org.apache.hadoop.hbase.HServerLoad$RegionLoad.readFields(HServerLoad.java:379)
> at 
> org.apache.hadoop.hbase.HServerLoad.readFields(HServerLoad.java:686)
> at 
> org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:681)
> ... 9 more
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5256) Use WritableUtils.readVInt() in RegionLoad.readFields()

2012-04-14 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13254263#comment-13254263
 ] 

Lars Hofhansl commented on HBASE-5256:
--

Or does it now? I think that affects the public interface.

> Use WritableUtils.readVInt() in RegionLoad.readFields()
> ---
>
> Key: HBASE-5256
> URL: https://issues.apache.org/jira/browse/HBASE-5256
> Project: HBase
>  Issue Type: Task
>Reporter: Zhihong Yu
>Assignee: Mubarak Seyed
> Fix For: 0.94.0
>
> Attachments: HBASE-5256.trunk.v1.patch
>
>
> Currently in.readInt() is used in RegionLoad.readFields().
> More metrics would be added to RegionLoad in the future; we should utilize 
> WritableUtils.readVInt() to reduce the amount of data exchanged between 
> the Master and region servers.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5256) Use WritableUtils.readVInt() in RegionLoad.readFields()

2012-04-14 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13254262#comment-13254262
 ] 

Lars Hofhansl commented on HBASE-5256:
--

I see. This needs to be reverted from the 0.94 branch, otherwise it breaks 
compatibility with 0.92.

> Use WritableUtils.readVInt() in RegionLoad.readFields()
> ---
>
> Key: HBASE-5256
> URL: https://issues.apache.org/jira/browse/HBASE-5256
> Project: HBase
>  Issue Type: Task
>Reporter: Zhihong Yu
>Assignee: Mubarak Seyed
> Fix For: 0.94.0
>
> Attachments: HBASE-5256.trunk.v1.patch
>
>
> Currently in.readInt() is used in RegionLoad.readFields().
> More metrics would be added to RegionLoad in the future; we should utilize 
> WritableUtils.readVInt() to reduce the amount of data exchanged between 
> the Master and region servers.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5781) Zookeeper session got closed while trying to assign the region to RS using hbck -fix

2012-04-14 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13254260#comment-13254260
 ] 

Lars Hofhansl commented on HBASE-5781:
--

+1 on patch.

> Zookeeper session got closed while trying to assign the region to RS using 
> hbck -fix
> 
>
> Key: HBASE-5781
> URL: https://issues.apache.org/jira/browse/HBASE-5781
> Project: HBase
>  Issue Type: Bug
>  Components: hbck
>Affects Versions: 0.90.7, 0.92.1, 0.94.0, 0.96.0
>Reporter: Kristam Subba Swathi
>Assignee: Jonathan Hsieh
>Priority: Critical
> Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
> Attachments: hbase-5781.patch
>
>
> After running hbck in the cluster, it is found that one region is not 
> assigned.
> So hbck -fix is used to fix this.
> But the assignment didn't happen since the zookeeper session is closed.
> Please find the attached trace for more details.
> -
> Trying to fix unassigned region...
> 12/04/03 11:02:57 INFO util.HBaseFsckRepair: Region still in transition, 
> waiting for it to become assigned: {NAME => 
> 'ufdr,002300,179123498.00871fbd7583512e12c4eb38e900be8d.', STARTKEY => 
> '002300', ENDKEY => '002311', ENCODED => 00871fbd7583512e12c4eb38e900be8d,}
> 12/04/03 11:02:58 INFO client.HConnectionManager$HConnectionImplementation: 
> Closed zookeeper sessionid=0x236738a263a
> 12/04/03 11:02:58 INFO zookeeper.ZooKeeper: Session: 0x236738a263a closed
> ERROR: Region { meta => 
> ufdr,010444,179123857.01594219211d0035b9586f98954462e1., hdfs => 
> hdfs://10.18.40.25:9000/hbase/ufdr/01594219211d0035b9586f98954462e1, deployed 
> => } not deployed on any region server.
> Trying to fix unassigned region...
> 12/04/03 11:02:58 INFO zookeeper.ClientCnxn: EventThread shut down
> 12/04/03 11:02:58 WARN zookeeper.ZKUtil: hconnection-0x236738a263a Unable 
> to set watcher on znode (/hbase)
> org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode 
> = Session expired for /hbase
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1021)
> at 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:150)
> at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:263)
> at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.checkIfBaseNodeAvailable(ZooKeeperNodeTracker.java:208)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.checkIfBaseNodeAvailable(HConnectionManager.java:695)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getMaster(HConnectionManager.java:626)
> at org.apache.hadoop.hbase.client.HBaseAdmin.getMaster(HBaseAdmin.java:211)
> at org.apache.hadoop.hbase.client.HBaseAdmin.assign(HBaseAdmin.java:1325)
> at 
> org.apache.hadoop.hbase.util.HBaseFsckRepair.forceOfflineInZK(HBaseFsckRepair.java:109)
> at 
> org.apache.hadoop.hbase.util.HBaseFsckRepair.fixUnassigned(HBaseFsckRepair.java:92)
> at 
> org.apache.hadoop.hbase.util.HBaseFsck.tryAssignmentRepair(HBaseFsck.java:1235)
> at 
> org.apache.hadoop.hbase.util.HBaseFsck.checkRegionConsistency(HBaseFsck.java:1351)
> at 
> org.apache.hadoop.hbase.util.HBaseFsck.checkAndFixConsistency(HBaseFsck.java:1114)
> at 
> org.apache.hadoop.hbase.util.HBaseFsck.onlineConsistencyRepair(HBaseFsck.java:356)
> at org.apache.hadoop.hbase.util.HBaseFsck.onlineHbck(HBaseFsck.java:375)
> at org.apache.hadoop.hbase.util.HBaseFsck.main(HBaseFsck.java:2894)
> 12/04/03 11:02:58 ERROR zookeeper.ZooKeeperWatcher: 
> hconnection-0x236738a263a Received unexpected KeeperException, 
> re-throwing exception
> org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode 
> = Session expired for /hbase
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1021)
> at 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:150)
> at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:263)
> at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.checkIfBaseNodeAvailable(ZooKeeperNodeTracker.java:208)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.checkIfBaseNodeAvailable(HConnectionManager.java:695)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getMaster(HConnectionManager.java:626)
> at org.apache.hadoop.hbase.client.HBaseAdmin.getMaster(HB

[jira] [Commented] (HBASE-5781) Zookeeper session got closed while trying to assign the region to RS using hbck -fix

2012-04-14 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13254211#comment-13254211
 ] 

Lars Hofhansl commented on HBASE-5781:
--

Dang :(

> Zookeeper session got closed while trying to assign the region to RS using 
> hbck -fix
> 
>
> Key: HBASE-5781
> URL: https://issues.apache.org/jira/browse/HBASE-5781
> Project: HBase
>  Issue Type: Bug
>  Components: hbck
>Affects Versions: 0.94.0, 0.96.0
>Reporter: Kristam Subba Swathi
>Assignee: Jonathan Hsieh
>Priority: Critical
> Fix For: 0.94.0
>
>
> After running hbck in the cluster, it is found that one region is not 
> assigned.
> So hbck -fix is used to fix this.
> But the assignment didn't happen since the zookeeper session is closed.
> Please find the attached trace for more details.
> -
> Trying to fix unassigned region...
> 12/04/03 11:02:57 INFO util.HBaseFsckRepair: Region still in transition, 
> waiting for it to become assigned: {NAME => 
> 'ufdr,002300,179123498.00871fbd7583512e12c4eb38e900be8d.', STARTKEY => 
> '002300', ENDKEY => '002311', ENCODED => 00871fbd7583512e12c4eb38e900be8d,}
> 12/04/03 11:02:58 INFO client.HConnectionManager$HConnectionImplementation: 
> Closed zookeeper sessionid=0x236738a263a
> 12/04/03 11:02:58 INFO zookeeper.ZooKeeper: Session: 0x236738a263a closed
> ERROR: Region { meta => 
> ufdr,010444,179123857.01594219211d0035b9586f98954462e1., hdfs => 
> hdfs://10.18.40.25:9000/hbase/ufdr/01594219211d0035b9586f98954462e1, deployed 
> => } not deployed on any region server.
> Trying to fix unassigned region...
> 12/04/03 11:02:58 INFO zookeeper.ClientCnxn: EventThread shut down
> 12/04/03 11:02:58 WARN zookeeper.ZKUtil: hconnection-0x236738a263a Unable 
> to set watcher on znode (/hbase)
> org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode 
> = Session expired for /hbase
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1021)
> at 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:150)
> at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:263)
> at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.checkIfBaseNodeAvailable(ZooKeeperNodeTracker.java:208)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.checkIfBaseNodeAvailable(HConnectionManager.java:695)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getMaster(HConnectionManager.java:626)
> at org.apache.hadoop.hbase.client.HBaseAdmin.getMaster(HBaseAdmin.java:211)
> at org.apache.hadoop.hbase.client.HBaseAdmin.assign(HBaseAdmin.java:1325)
> at 
> org.apache.hadoop.hbase.util.HBaseFsckRepair.forceOfflineInZK(HBaseFsckRepair.java:109)
> at 
> org.apache.hadoop.hbase.util.HBaseFsckRepair.fixUnassigned(HBaseFsckRepair.java:92)
> at 
> org.apache.hadoop.hbase.util.HBaseFsck.tryAssignmentRepair(HBaseFsck.java:1235)
> at 
> org.apache.hadoop.hbase.util.HBaseFsck.checkRegionConsistency(HBaseFsck.java:1351)
> at 
> org.apache.hadoop.hbase.util.HBaseFsck.checkAndFixConsistency(HBaseFsck.java:1114)
> at 
> org.apache.hadoop.hbase.util.HBaseFsck.onlineConsistencyRepair(HBaseFsck.java:356)
> at org.apache.hadoop.hbase.util.HBaseFsck.onlineHbck(HBaseFsck.java:375)
> at org.apache.hadoop.hbase.util.HBaseFsck.main(HBaseFsck.java:2894)
> 12/04/03 11:02:58 ERROR zookeeper.ZooKeeperWatcher: 
> hconnection-0x236738a263a Received unexpected KeeperException, 
> re-throwing exception
> org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode 
> = Session expired for /hbase
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1021)
> at 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:150)
> at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:263)
> at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.checkIfBaseNodeAvailable(ZooKeeperNodeTracker.java:208)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.checkIfBaseNodeAvailable(HConnectionManager.java:695)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getMaster(HConnectionManager.java:626)
> at org.apache.hadoop.hbase.client.HBaseAdmin.getMaster(HBaseAdmin.java:211)
> at org.apache.hadoop.hbase.client.HBaseAdmin.assign(HBaseAdmin.jav

[jira] [Commented] (HBASE-5790) ZKUtil deleteRecursively should be a recoverable operation

2012-04-14 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13254208#comment-13254208
 ] 

Lars Hofhansl commented on HBASE-5790:
--

+1 on V2

> ZKUtil deleteRecursively should be a recoverable operation
> --
>
> Key: HBASE-5790
> URL: https://issues.apache.org/jira/browse/HBASE-5790
> Project: HBase
>  Issue Type: Improvement
>Reporter: Jesse Yates
>Assignee: Jesse Yates
>  Labels: zookeeper
> Fix For: 0.96.0, 0.94.1
>
> Attachments: java_HBASE-5790-v1.patch, java_HBASE-5790.patch
>
>
> As of 3.4.3, ZooKeeper now has full multi-operation transactions. This means 
> we can wholesale delete chunks of the zk tree and ensure that we don't have 
> any pesky recursive delete issues where we delete the children of a node, but 
> then a child joins before deletion of the parent. Even without transactions, 
> this should be the behavior, but it is possible to make it much cleaner now 
> that we have this new feature in zk.
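
To illustrate the multi-operation delete the description above talks about, here is a minimal sketch against the plain ZooKeeper client API (this is not the attached ZKUtil patch):

{code}
// Sketch only: collect the subtree bottom-up and delete it in one multi() call.
import java.util.ArrayList;
import java.util.List;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.Op;
import org.apache.zookeeper.ZooKeeper;

public class RecursiveDelete {
  /** Deletes znode and everything under it in one atomic multi() call. */
  public static void deleteRecursively(ZooKeeper zk, String znode)
      throws KeeperException, InterruptedException {
    List<Op> ops = new ArrayList<Op>();
    collect(zk, znode, ops);
    zk.multi(ops); // all-or-nothing: no child can sneak in mid-delete
  }

  private static void collect(ZooKeeper zk, String znode, List<Op> ops)
      throws KeeperException, InterruptedException {
    for (String child : zk.getChildren(znode, false)) {
      collect(zk, znode + "/" + child, ops); // children first (bottom-up)
    }
    ops.add(Op.delete(znode, -1)); // -1 = any version
  }
}
{code}

If a child appears between getChildren() and multi(), the whole transaction simply fails and can be retried rather than leaving a half-deleted tree; for a very large subtree, though, the single request shipped to ZK gets correspondingly large.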

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5790) ZKUtil deleteRecursively should be a recoverable operation

2012-04-14 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13254207#comment-13254207
 ] 

Lars Hofhansl commented on HBASE-5790:
--

@Jesse: I was more referring to the request size that is shipped to ZK. Probably 
not a problem.


> ZKUtil deleteRecursively should be a recoverable operation
> --
>
> Key: HBASE-5790
> URL: https://issues.apache.org/jira/browse/HBASE-5790
> Project: HBase
>  Issue Type: Improvement
>Reporter: Jesse Yates
>Assignee: Jesse Yates
>  Labels: zookeeper
> Fix For: 0.96.0, 0.94.1
>
> Attachments: java_HBASE-5790-v1.patch, java_HBASE-5790.patch
>
>
> As of 3.4.3, ZooKeeper now has full multi-operation transactions. This means 
> we can wholesale delete chunks of the zk tree and ensure that we don't have 
> any pesky recursive delete issues where we delete the children of a node, but 
> then a child joins before deletion of the parent. Even without transactions, 
> this should be the behavior, but it is possible to make it much cleaner now 
> that we have this new feature in zk.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5781) Zookeeper session got closed while trying to assign the region to RS using hbck -fix

2012-04-14 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13254205#comment-13254205
 ] 

Lars Hofhansl commented on HBASE-5781:
--

@Jon: Are you saying sink rc1 for this?

> Zookeeper session got closed while trying to assign the region to RS using 
> hbck -fix
> 
>
> Key: HBASE-5781
> URL: https://issues.apache.org/jira/browse/HBASE-5781
> Project: HBase
>  Issue Type: Bug
>  Components: hbck
>Affects Versions: 0.94.0, 0.96.0
>Reporter: Kristam Subba Swathi
>Assignee: Jonathan Hsieh
>Priority: Critical
> Fix For: 0.94.0
>
>
> After running hbck in the cluster, it is found that one region is not 
> assigned.
> So hbck -fix is used to fix this.
> But the assignment didn't happen since the zookeeper session is closed.
> Please find the attached trace for more details.
> -
> Trying to fix unassigned region...
> 12/04/03 11:02:57 INFO util.HBaseFsckRepair: Region still in transition, 
> waiting for it to become assigned: {NAME => 
> 'ufdr,002300,179123498.00871fbd7583512e12c4eb38e900be8d.', STARTKEY => 
> '002300', ENDKEY => '002311', ENCODED => 00871fbd7583512e12c4eb38e900be8d,}
> 12/04/03 11:02:58 INFO client.HConnectionManager$HConnectionImplementation: 
> Closed zookeeper sessionid=0x236738a263a
> 12/04/03 11:02:58 INFO zookeeper.ZooKeeper: Session: 0x236738a263a closed
> ERROR: Region { meta => 
> ufdr,010444,179123857.01594219211d0035b9586f98954462e1., hdfs => 
> hdfs://10.18.40.25:9000/hbase/ufdr/01594219211d0035b9586f98954462e1, deployed 
> => } not deployed on any region server.
> Trying to fix unassigned region...
> 12/04/03 11:02:58 INFO zookeeper.ClientCnxn: EventThread shut down
> 12/04/03 11:02:58 WARN zookeeper.ZKUtil: hconnection-0x236738a263a Unable 
> to set watcher on znode (/hbase)
> org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode 
> = Session expired for /hbase
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1021)
> at 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:150)
> at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:263)
> at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.checkIfBaseNodeAvailable(ZooKeeperNodeTracker.java:208)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.checkIfBaseNodeAvailable(HConnectionManager.java:695)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getMaster(HConnectionManager.java:626)
> at org.apache.hadoop.hbase.client.HBaseAdmin.getMaster(HBaseAdmin.java:211)
> at org.apache.hadoop.hbase.client.HBaseAdmin.assign(HBaseAdmin.java:1325)
> at 
> org.apache.hadoop.hbase.util.HBaseFsckRepair.forceOfflineInZK(HBaseFsckRepair.java:109)
> at 
> org.apache.hadoop.hbase.util.HBaseFsckRepair.fixUnassigned(HBaseFsckRepair.java:92)
> at 
> org.apache.hadoop.hbase.util.HBaseFsck.tryAssignmentRepair(HBaseFsck.java:1235)
> at 
> org.apache.hadoop.hbase.util.HBaseFsck.checkRegionConsistency(HBaseFsck.java:1351)
> at 
> org.apache.hadoop.hbase.util.HBaseFsck.checkAndFixConsistency(HBaseFsck.java:1114)
> at 
> org.apache.hadoop.hbase.util.HBaseFsck.onlineConsistencyRepair(HBaseFsck.java:356)
> at org.apache.hadoop.hbase.util.HBaseFsck.onlineHbck(HBaseFsck.java:375)
> at org.apache.hadoop.hbase.util.HBaseFsck.main(HBaseFsck.java:2894)
> 12/04/03 11:02:58 ERROR zookeeper.ZooKeeperWatcher: 
> hconnection-0x236738a263a Received unexpected KeeperException, 
> re-throwing exception
> org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode 
> = Session expired for /hbase
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1021)
> at 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:150)
> at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:263)
> at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.checkIfBaseNodeAvailable(ZooKeeperNodeTracker.java:208)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.checkIfBaseNodeAvailable(HConnectionManager.java:695)
> at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getMaster(HConnectionManager.java:626)
> at org.apache.hadoop.hbase.client.HBaseAdmin.getMaster(HBaseAdmin.java:211)
> at org.apache.hadoop.hbase.client.

[jira] [Commented] (HBASE-5778) Turn on WAL compression by default

2012-04-14 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13254204#comment-13254204
 ] 

Lars Hofhansl commented on HBASE-5778:
--

@Ted: Unfortunately it is not as simple as that. As I tried to explain above, 
the ReplicationSource reads from the WAL files at offsets that are stored in 
ZK. This does not work any longer, as you can no longer start reading the WAL 
at an offset. The files need to be read from the beginning to build up the 
dictionary.
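
Roughly, with compression on, resuming at a stored offset would have to look 
something like the sketch below (not the actual ReplicationSource code; 
HLog.getReader and Reader#getPosition are taken from the 0.94 API, and fs, 
walPath, conf, storedZkOffset and shipToSink are placeholders):
{code}
// Sketch only: with WAL compression, the entries before the stored offset
// still have to be read (not seeked over), because reading them is what
// rebuilds the compression dictionary.
HLog.Reader reader = HLog.getReader(fs, walPath, conf);
try {
  HLog.Entry entry;
  while ((entry = reader.next()) != null) {
    if (reader.getPosition() <= storedZkOffset) {
      continue; // already replicated; read only to rebuild the dictionary
    }
    shipToSink(entry); // hypothetical stand-in for the actual shipping code
  }
} finally {
  reader.close();
}
{code}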


> Turn on WAL compression by default
> --
>
> Key: HBASE-5778
> URL: https://issues.apache.org/jira/browse/HBASE-5778
> Project: HBase
>  Issue Type: Improvement
>Reporter: Jean-Daniel Cryans
>Assignee: Lars Hofhansl
>Priority: Blocker
> Fix For: 0.96.0, 0.94.1
>
> Attachments: 5778-addendum.txt, 5778.addendum, HBASE-5778.patch
>
>
> I ran some tests to verify if WAL compression should be turned on by default.
> For a use case where it's not very useful (values two orders of magnitude 
> bigger than the keys), the insert time wasn't different and the CPU usage was 15% 
> higher (150% CPU usage VS 130% when not compressing the WAL).
> When values are smaller than the keys, I saw a 38% improvement for the insert 
> run time and CPU usage was 33% higher (600% CPU usage VS 450%). I'm not sure 
> WAL compression accounts for all the additional CPU usage, it might just be 
> that we're able to insert faster and we spend more time in the MemStore per 
> second (because our MemStores are bad when they contain tens of thousands of 
> values).
> Those are two extremes, but it shows that for the price of some CPU we can 
> save a lot. My machines have 2 quads with HT, so I still had a lot of idle 
> CPUs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5782) Not all the regions are getting assigned after the log splitting.

2012-04-13 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253996#comment-13253996
 ] 

Lars Hofhansl commented on HBASE-5782:
--

And is there anything interesting in the log files?

> Not all the regions are getting assigned after the log splitting.
> -
>
> Key: HBASE-5782
> URL: https://issues.apache.org/jira/browse/HBASE-5782
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 0.94.0
>Reporter: Gopinathan A
>Priority: Critical
>
> Create a table with 1000 splits; after the region assignment, kill the 
> regionserver which contains the META table.
> Here a few regions are missing after the log splitting and region assignment. 
> The HBCK report shows multiple region holes got created.
> The same scenario was verified multiple times in 0.92.1 with no issues.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5778) Turn on WAL compression by default

2012-04-13 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253995#comment-13253995
 ] 

Lars Hofhansl commented on HBASE-5778:
--

This fundamentally breaks replication.

The problem above is actually that the HLogKey and WALEdit, after being read 
from a compressed HLog, still have the compression context set, and hence it will 
be used to compress them when they are sent over the wire to the sink. Of course 
the sink does not know how to decompress them.

So I just set the compression context to null in ReplicationSource.

With that hurdle out of the way, I find that seeking to a specific position in 
the HLog (the position stored in ZK) does not work, because the dictionary is 
not built up (compressed HLogs always need to be read from the beginning).

Not sure how to fix the 2nd part.
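
In other words, the first workaround amounts to something like this sketch (not 
the exact diff; the setCompressionContext setters are assumed to be the ones the 
WAL-compression change added to HLogKey and WALEdit, and reader is the 
HLog.Reader already held by ReplicationSource):
{code}
// Sketch: strip the compression context from entries read out of a compressed
// HLog before handing them to the replication sink, so they are serialized
// uncompressed on the wire.
HLog.Entry entry;
while ((entry = reader.next()) != null) {
  entry.getKey().setCompressionContext(null);
  entry.getEdit().setCompressionContext(null);
  // ... queue the entry for shipping as before ...
}
{code}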

> Turn on WAL compression by default
> --
>
> Key: HBASE-5778
> URL: https://issues.apache.org/jira/browse/HBASE-5778
> Project: HBase
>  Issue Type: Improvement
>Reporter: Jean-Daniel Cryans
>Assignee: Lars Hofhansl
>Priority: Blocker
> Fix For: 0.96.0, 0.94.1
>
> Attachments: 5778-addendum.txt, 5778.addendum, HBASE-5778.patch
>
>
> I ran some tests to verify if WAL compression should be turned on by default.
> For a use case where it's not very useful (values two orders of magnitude 
> bigger than the keys), the insert time wasn't different and the CPU usage was 15% 
> higher (150% CPU usage VS 130% when not compressing the WAL).
> When values are smaller than the keys, I saw a 38% improvement for the insert 
> run time and CPU usage was 33% higher (600% CPU usage VS 450%). I'm not sure 
> WAL compression accounts for all the additional CPU usage, it might just be 
> that we're able to insert faster and we spend more time in the MemStore per 
> second (because our MemStores are bad when they contain tens of thousands of 
> values).
> Those are two extremes, but it shows that for the price of some CPU we can 
> save a lot. My machines have 2 quads with HT, so I still had a lot of idle 
> CPUs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5782) Not all the regions are getting assigned after the log splitting.

2012-04-13 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253990#comment-13253990
 ] 

Lars Hofhansl commented on HBASE-5782:
--

You tried multiple times in 0.92.1, and no problem?
Any chance to capture this in a test? (Although that might be difficult)

> Not all the regions are getting assigned after the log splitting.
> -
>
> Key: HBASE-5782
> URL: https://issues.apache.org/jira/browse/HBASE-5782
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 0.94.0
>Reporter: Gopinathan A
>Priority: Critical
>
> Create a table with 1000 splits; after the region assignment, kill the 
> regionserver which contains the META table.
> Here a few regions are missing after the log splitting and region assignment. 
> The HBCK report shows multiple region holes got created.
> The same scenario was verified multiple times in 0.92.1 with no issues.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5790) ZKUtil deleteRecurisively should be a recoverable operation

2012-04-13 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253986#comment-13253986
 ] 

Lars Hofhansl commented on HBASE-5790:
--

Patch looks good.
Will there be any request size problem if the subtree to delete is large?
Super minor nit: Should point out in the Javadoc that deleteRecursively is an 
idempotent operation.
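
For illustration, the multi-based recursive delete boils down to something like 
the sketch below (not the patch itself; listSubtreeBottomUp is a hypothetical 
helper that lists children before their parents, while ZooKeeper#multi and 
Op.delete are the ZooKeeper 3.4 APIs):
{code}
// Sketch: delete a whole subtree in one ZooKeeper transaction so a child
// created mid-delete fails the whole operation instead of leaving orphans.
List<Op> ops = new ArrayList<Op>();
for (String path : listSubtreeBottomUp(zk, rootZNode)) { // hypothetical helper
  ops.add(Op.delete(path, -1)); // -1 = delete regardless of version
}
zk.multi(ops); // all-or-nothing; a very large subtree may hit the request size limit
{code}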


> ZKUtil deleteRecurisively should be a recoverable operation
> ---
>
> Key: HBASE-5790
> URL: https://issues.apache.org/jira/browse/HBASE-5790
> Project: HBase
>  Issue Type: Improvement
>Reporter: Jesse Yates
>Assignee: Jesse Yates
>  Labels: zookeeper
> Fix For: 0.96.0, 0.94.1
>
> Attachments: java_HBASE-5790.patch
>
>
> As of 3.4.3 Zookeeper now has full, multi-operation transactions. This means 
> we can wholesale delete chunks of the zk tree and ensure that we don't have 
> any pesky recursive delete issues where we delete the children of a node, but 
> then a child joins before deletion of the parent. Even without transactions, 
> this should be the behavior, but it is possible to make it much cleaner now 
> that we have this new feature in zk.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5782) Not all the regions are getting assigned after the log splitting.

2012-04-13 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253886#comment-13253886
 ] 

Lars Hofhansl commented on HBASE-5782:
--

Did you have HBASE-5778 enabled?

> Not all the regions are getting assigned after the log splitting.
> -
>
> Key: HBASE-5782
> URL: https://issues.apache.org/jira/browse/HBASE-5782
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 0.94.0
>Reporter: Gopinathan A
>Priority: Critical
>
> Create a table with 1000 splits; after the region assignment, kill the 
> regionserver which contains the META table.
> Here a few regions are missing after the log splitting and region assignment. 
> The HBCK report shows multiple region holes got created.
> The same scenario was verified multiple times in 0.92.1 with no issues.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5604) M/R tool to replay WAL files

2012-04-13 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253839#comment-13253839
 ] 

Lars Hofhansl commented on HBASE-5604:
--

OK. Done. I'll use that build as RC1.

> M/R tool to replay WAL files
> 
>
> Key: HBASE-5604
> URL: https://issues.apache.org/jira/browse/HBASE-5604
> Project: HBase
>  Issue Type: New Feature
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5604-v10.txt, 5604-v11.txt, 5604-v4.txt, 5604-v6.txt, 
> 5604-v7.txt, 5604-v8.txt, 5604-v9.txt, HLog-5604-v3.txt
>
>
> Just an idea I had. Might be useful for restore of a backup using the HLogs.
> This could be an M/R job (with a mapper per HLog file).
> The tool would get a timerange and a (set of) table(s). We'd pick the right 
> HLogs based on time before the M/R job is started and then have a mapper per 
> HLog file.
> The mapper would then go through the HLog, filter all WALEdits that didn't 
> fit into the time range or are not from any of the tables, and then use 
> HFileOutputFormat to generate HFiles.
> Would need to indicate the splits we want, probably from a live table.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5604) M/R tool to replay WAL files

2012-04-13 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253824#comment-13253824
 ] 

Lars Hofhansl commented on HBASE-5604:
--

@Stack and Ted: I'll just remove that test. The HFile pretty printer will show 
time in the same TZ as WALPlayer would parse it, which is still useful.
I'll not file an extra jira but post an addendum to this one.

> M/R tool to replay WAL files
> 
>
> Key: HBASE-5604
> URL: https://issues.apache.org/jira/browse/HBASE-5604
> Project: HBase
>  Issue Type: New Feature
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5604-v10.txt, 5604-v11.txt, 5604-v4.txt, 5604-v6.txt, 
> 5604-v7.txt, 5604-v8.txt, 5604-v9.txt, HLog-5604-v3.txt
>
>
> Just an idea I had. Might be useful for restore of a backup using the HLogs.
> This could be an M/R job (with a mapper per HLog file).
> The tool would get a timerange and a (set of) table(s). We'd pick the right 
> HLogs based on time before the M/R job is started and then have a mapper per 
> HLog file.
> The mapper would then go through the HLog, filter all WALEdits that didn't 
> fit into the time range or are not from any of the tables, and then use 
> HFileOutputFormat to generate HFiles.
> Would need to indicate the splits we want, probably from a live table.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5604) M/R tool to replay WAL files

2012-04-13 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253765#comment-13253765
 ] 

Lars Hofhansl commented on HBASE-5604:
--

Sigh... That's exactly what I was afraid of when adding date parsing.
What timezone is used to store the WAL edit? Is it GMT? If so, the date should 
be interpreted as GMT.
If the date in the WAL is always in the local TZ, I should remove the date 
verification test (which in a sense is pointless anyway, because it just tests 
whether SimpleDateFormat works).
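
For reference, interpreting the command-line dates as GMT would just mean 
pinning the formatter's timezone, along these lines (illustration only, with an 
example pattern and input, not the WALPlayer code):
{code}
// Sketch: parse the user-supplied time range as GMT instead of the local TZ.
SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss"); // example pattern
fmt.setTimeZone(TimeZone.getTimeZone("GMT"));
long startTime = fmt.parse("2012-04-01T00:00:00").getTime(); // ParseException handling omitted
{code}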

> M/R tool to replay WAL files
> 
>
> Key: HBASE-5604
> URL: https://issues.apache.org/jira/browse/HBASE-5604
> Project: HBase
>  Issue Type: New Feature
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5604-v10.txt, 5604-v11.txt, 5604-v4.txt, 5604-v6.txt, 
> 5604-v7.txt, 5604-v8.txt, 5604-v9.txt, HLog-5604-v3.txt
>
>
> Just an idea I had. Might be useful for restore of a backup using the HLogs.
> This could be an M/R job (with a mapper per HLog file).
> The tool would get a timerange and a (set of) table(s). We'd pick the right 
> HLogs based on time before the M/R job is started and then have a mapper per 
> HLog file.
> The mapper would then go through the HLog, filter all WALEdits that didn't 
> fit into the time range or are not from any of the tables, and then use 
> HFileOutputFormat to generate HFiles.
> Would need to indicate the splits we want, probably from a live table.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5778) Turn on WAL compression by default

2012-04-13 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253614#comment-13253614
 ] 

Lars Hofhansl commented on HBASE-5778:
--

Oh I see. The KVs are only decompressed when read.

> Turn on WAL compression by default
> --
>
> Key: HBASE-5778
> URL: https://issues.apache.org/jira/browse/HBASE-5778
> Project: HBase
>  Issue Type: Improvement
>Reporter: Jean-Daniel Cryans
>Assignee: Lars Hofhansl
>Priority: Blocker
> Fix For: 0.96.0, 0.94.1
>
> Attachments: 5778-addendum.txt, 5778.addendum, HBASE-5778.patch
>
>
> I ran some tests to verify if WAL compression should be turned on by default.
> For a use case where it's not very useful (values two orders of magnitude 
> bigger than the keys), the insert time wasn't different and the CPU usage was 15% 
> higher (150% CPU usage VS 130% when not compressing the WAL).
> When values are smaller than the keys, I saw a 38% improvement for the insert 
> run time and CPU usage was 33% higher (600% CPU usage VS 450%). I'm not sure 
> WAL compression accounts for all the additional CPU usage, it might just be 
> that we're able to insert faster and we spend more time in the MemStore per 
> second (because our MemStores are bad when they contain tens of thousands of 
> values).
> Those are two extremes, but it shows that for the price of some CPU we can 
> save a lot. My machines have 2 quads with HT, so I still had a lot of idle 
> CPUs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5778) Turn on WAL compression by default

2012-04-13 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253588#comment-13253588
 ] 

Lars Hofhansl commented on HBASE-5778:
--

I still don't understand why this is a problem with replication. J-D, do you 
have any insights?

> Turn on WAL compression by default
> --
>
> Key: HBASE-5778
> URL: https://issues.apache.org/jira/browse/HBASE-5778
> Project: HBase
>  Issue Type: Improvement
>Reporter: Jean-Daniel Cryans
>Assignee: Lars Hofhansl
>Priority: Blocker
> Fix For: 0.96.0, 0.94.1
>
> Attachments: 5778-addendum.txt, 5778.addendum, HBASE-5778.patch
>
>
> I ran some tests to verify if WAL compression should be turned on by default.
> For a use case where it's not very useful (values two orders of magnitude 
> bigger than the keys), the insert time wasn't different and the CPU usage was 15% 
> higher (150% CPU usage VS 130% when not compressing the WAL).
> When values are smaller than the keys, I saw a 38% improvement for the insert 
> run time and CPU usage was 33% higher (600% CPU usage VS 450%). I'm not sure 
> WAL compression accounts for all the additional CPU usage, it might just be 
> that we're able to insert faster and we spend more time in the MemStore per 
> second (because our MemStores are bad when they contain tens of thousands of 
> values).
> Those are two extremes, but it shows that for the price of some CPU we can 
> save a lot. My machines have 2 quads with HT, so I still had a lot of idle 
> CPUs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5677) The master never does balance because duplicate openhandled the one region

2012-04-13 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253549#comment-13253549
 ] 

Lars Hofhansl commented on HBASE-5677:
--

Arghh... OK. So:
* in 0.94+ this is fixed, correct?
* you'd like to backport HBASE-5454 to 0.90 and 0.92, right?

So let's close this one then?

> The master never does balance because duplicate openhandled the one region
> --
>
> Key: HBASE-5677
> URL: https://issues.apache.org/jira/browse/HBASE-5677
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.6
> Environment: 0.90
>Reporter: xufeng
>Assignee: xufeng
> Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
> Attachments: 5677-proposal.txt, 5677-proposal.txt, 5677-proposal.txt, 
> HBASE-5677-90-v1.patch, surefire-report_no_patched_v1.html, 
> surefire-report_patched_v1.html
>
>
> If a region gets assigned while the master is doing initialization (before 
> processFailover), the region will be open-handled twice,
> because the unassigned node in zookeeper will be handled again in 
> AssignmentManager#processFailover().
> This leaves the region in RIT, thus the master never does balance.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5778) Turn on WAL compression by default

2012-04-13 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253540#comment-13253540
 ] 

Lars Hofhansl commented on HBASE-5778:
--

NP... I thought it was ready.
I am surprised that ReplicationSink has a problem, as it is the 
ReplicationSource that reads from the HLog.


> Turn on WAL compression by default
> --
>
> Key: HBASE-5778
> URL: https://issues.apache.org/jira/browse/HBASE-5778
> Project: HBase
>  Issue Type: Improvement
>Reporter: Jean-Daniel Cryans
>Assignee: Lars Hofhansl
>Priority: Blocker
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5778-addendum.txt, 5778.addendum, HBASE-5778.patch
>
>
> I ran some tests to verify if WAL compression should be turned on by default.
> For a use case where it's not very useful (values two orders of magnitude 
> bigger than the keys), the insert time wasn't different and the CPU usage was 15% 
> higher (150% CPU usage VS 130% when not compressing the WAL).
> When values are smaller than the keys, I saw a 38% improvement for the insert 
> run time and CPU usage was 33% higher (600% CPU usage VS 450%). I'm not sure 
> WAL compression accounts for all the additional CPU usage, it might just be 
> that we're able to insert faster and we spend more time in the MemStore per 
> second (because our MemStores are bad when they contain tens of thousands of 
> values).
> Those are two extremes, but it shows that for the price of some CPU we can 
> save a lot. My machines have 2 quads with HT, so I still had a lot of idle 
> CPUs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5778) Turn on WAL compression by default

2012-04-12 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253118#comment-13253118
 ] 

Lars Hofhansl commented on HBASE-5778:
--

mvn failed with an OOME. Let's revert this change until we track these issues 
down.

> Turn on WAL compression by default
> --
>
> Key: HBASE-5778
> URL: https://issues.apache.org/jira/browse/HBASE-5778
> Project: HBase
>  Issue Type: Improvement
>Reporter: Jean-Daniel Cryans
>Assignee: Lars Hofhansl
>Priority: Blocker
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5778-addendum.txt, 5778.addendum, HBASE-5778.patch
>
>
> I ran some tests to verify if WAL compression should be turned on by default.
> For a use case where it's not very useful (values two orders of magnitude 
> bigger than the keys), the insert time wasn't different and the CPU usage was 15% 
> higher (150% CPU usage VS 130% when not compressing the WAL).
> When values are smaller than the keys, I saw a 38% improvement for the insert 
> run time and CPU usage was 33% higher (600% CPU usage VS 450%). I'm not sure 
> WAL compression accounts for all the additional CPU usage, it might just be 
> that we're able to insert faster and we spend more time in the MemStore per 
> second (because our MemStores are bad when they contain tens of thousands of 
> values).
> Those are two extremes, but it shows that for the price of some CPU we can 
> save a lot. My machines have 2 quads with HT, so I still had a lot of idle 
> CPUs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5604) M/R tool to replay WAL files

2012-04-12 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253095#comment-13253095
 ] 

Lars Hofhansl commented on HBASE-5604:
--

Done. Thanks Ted.

> M/R tool to replay WAL files
> 
>
> Key: HBASE-5604
> URL: https://issues.apache.org/jira/browse/HBASE-5604
> Project: HBase
>  Issue Type: New Feature
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5604-v10.txt, 5604-v11.txt, 5604-v4.txt, 5604-v6.txt, 
> 5604-v7.txt, 5604-v8.txt, 5604-v9.txt, HLog-5604-v3.txt
>
>
> Just an idea I had. Might be useful for restore of a backup using the HLogs.
> This could be an M/R job (with a mapper per HLog file).
> The tool would get a timerange and a (set of) table(s). We'd pick the right 
> HLogs based on time before the M/R job is started and then have a mapper per 
> HLog file.
> The mapper would then go through the HLog, filter all WALEdits that didn't 
> fit into the time range or are not from any of the tables, and then use 
> HFileOutputFormat to generate HFiles.
> Would need to indicate the splits we want, probably from a live table.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5604) M/R tool to replay WAL files

2012-04-12 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253094#comment-13253094
 ] 

Lars Hofhansl commented on HBASE-5604:
--

Yes, should do an addendum. Will mark as large.

> M/R tool to replay WAL files
> 
>
> Key: HBASE-5604
> URL: https://issues.apache.org/jira/browse/HBASE-5604
> Project: HBase
>  Issue Type: New Feature
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5604-v10.txt, 5604-v11.txt, 5604-v4.txt, 5604-v6.txt, 
> 5604-v7.txt, 5604-v8.txt, 5604-v9.txt, HLog-5604-v3.txt
>
>
> Just an idea I had. Might be useful for restore of a backup using the HLogs.
> This could be an M/R job (with a mapper per HLog file).
> The tool would get a timerange and a (set of) table(s). We'd pick the right 
> HLogs based on time before the M/R job is started and then have a mapper per 
> HLog file.
> The mapper would then go through the HLog, filter all WALEdits that didn't 
> fit into the time range or are not from any of the tables, and then use 
> HFileOutputFormat to generate HFiles.
> Would need to indicate the splits we want, probably from a live table.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5778) Turn on WAL compression by default

2012-04-12 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253091#comment-13253091
 ] 

Lars Hofhansl commented on HBASE-5778:
--

Similar... There're some other issues. I'll have a patch soon.

> Turn on WAL compression by default
> --
>
> Key: HBASE-5778
> URL: https://issues.apache.org/jira/browse/HBASE-5778
> Project: HBase
>  Issue Type: Improvement
>Reporter: Jean-Daniel Cryans
>Assignee: Lars Hofhansl
>Priority: Blocker
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5778.addendum, HBASE-5778.patch
>
>
> I ran some tests to verify if WAL compression should be turned on by default.
> For a use case where it's not very useful (values two orders of magnitude 
> bigger than the keys), the insert time wasn't different and the CPU usage was 15% 
> higher (150% CPU usage VS 130% when not compressing the WAL).
> When values are smaller than the keys, I saw a 38% improvement for the insert 
> run time and CPU usage was 33% higher (600% CPU usage VS 450%). I'm not sure 
> WAL compression accounts for all the additional CPU usage, it might just be 
> that we're able to insert faster and we spend more time in the MemStore per 
> second (because our MemStores are bad when they contain tens of thousands of 
> values).
> Those are two extremes, but it shows that for the price of some CPU we can 
> save a lot. My machines have 2 quads with HT, so I still had a lot of idle 
> CPUs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5778) Turn on WAL compression by default

2012-04-12 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253065#comment-13253065
 ] 

Lars Hofhansl commented on HBASE-5778:
--

Have a fix for TestHLog, working on TestHLogSplit

> Turn on WAL compression by default
> --
>
> Key: HBASE-5778
> URL: https://issues.apache.org/jira/browse/HBASE-5778
> Project: HBase
>  Issue Type: Improvement
>Reporter: Jean-Daniel Cryans
>Assignee: Jean-Daniel Cryans
>Priority: Blocker
> Fix For: 0.94.0, 0.96.0
>
> Attachments: HBASE-5778.patch
>
>
> I ran some tests to verify if WAL compression should be turned on by default.
> For a use case where it's not very useful (values two orders of magnitude 
> bigger than the keys), the insert time wasn't different and the CPU usage was 15% 
> higher (150% CPU usage VS 130% when not compressing the WAL).
> When values are smaller than the keys, I saw a 38% improvement for the insert 
> run time and CPU usage was 33% higher (600% CPU usage VS 450%). I'm not sure 
> WAL compression accounts for all the additional CPU usage, it might just be 
> that we're able to insert faster and we spend more time in the MemStore per 
> second (because our MemStores are bad when they contain tens of thousands of 
> values).
> Those are two extremes, but it shows that for the price of some CPU we can 
> save a lot. My machines have 2 quads with HT, so I still had a lot of idle 
> CPUs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5778) Turn on WAL compression by default

2012-04-12 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253063#comment-13253063
 ] 

Lars Hofhansl commented on HBASE-5778:
--

Then there's FaultySequenceFileLogReader, which does not do the right thing.

> Turn on WAL compression by default
> --
>
> Key: HBASE-5778
> URL: https://issues.apache.org/jira/browse/HBASE-5778
> Project: HBase
>  Issue Type: Improvement
>Reporter: Jean-Daniel Cryans
>Assignee: Jean-Daniel Cryans
>Priority: Blocker
> Fix For: 0.94.0, 0.96.0
>
> Attachments: HBASE-5778.patch
>
>
> I ran some tests to verify if WAL compression should be turned on by default.
> For a use case where it's not very useful (values two orders of magnitude 
> bigger than the keys), the insert time wasn't different and the CPU usage was 15% 
> higher (150% CPU usage VS 130% when not compressing the WAL).
> When values are smaller than the keys, I saw a 38% improvement for the insert 
> run time and CPU usage was 33% higher (600% CPU usage VS 450%). I'm not sure 
> WAL compression accounts for all the additional CPU usage, it might just be 
> that we're able to insert faster and we spend more time in the MemStore per 
> second (because our MemStores are bad when they contain tens of thousands of 
> values).
> Those are two extremes, but it shows that for the price of some CPU we can 
> save a lot. My machines have 2 quads with HT, so I still had a lot of idle 
> CPUs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5778) Turn on WAL compression by default

2012-04-12 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253061#comment-13253061
 ] 

Lars Hofhansl commented on HBASE-5778:
--

TestHLog.testAppendClose() uses this to read back the WALEdits:
{code}
// Make sure you can read all the content
SequenceFile.Reader reader
  = new SequenceFile.Reader(this.fs, walPath, this.conf);
{code}
Well, dah, that does not work.
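
A possible fix direction for the test (sketch only, using the HLog.getReader 
factory from the same package rather than a raw SequenceFile.Reader, so the 
reader is compression-aware):
{code}
// Sketch: let the WAL's own reader handle (de)compression when verifying the file.
HLog.Reader walReader = HLog.getReader(this.fs, walPath, this.conf);
int edits = 0;
while (walReader.next() != null) {
  edits++; // each entry is an (HLogKey, WALEdit) pair, already decompressed
}
walReader.close();
{code}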

> Turn on WAL compression by default
> --
>
> Key: HBASE-5778
> URL: https://issues.apache.org/jira/browse/HBASE-5778
> Project: HBase
>  Issue Type: Improvement
>Reporter: Jean-Daniel Cryans
>Assignee: Jean-Daniel Cryans
>Priority: Blocker
> Fix For: 0.94.0, 0.96.0
>
> Attachments: HBASE-5778.patch
>
>
> I ran some tests to verify if WAL compression should be turned on by default.
> For a use case where it's not very useful (values two orders of magnitude 
> bigger than the keys), the insert time wasn't different and the CPU usage was 15% 
> higher (150% CPU usage VS 130% when not compressing the WAL).
> When values are smaller than the keys, I saw a 38% improvement for the insert 
> run time and CPU usage was 33% higher (600% CPU usage VS 450%). I'm not sure 
> WAL compression accounts for all the additional CPU usage, it might just be 
> that we're able to insert faster and we spend more time in the MemStore per 
> second (because our MemStores are bad when they contain tens of thousands of 
> values).
> Those are two extremes, but it shows that for the price of some CPU we can 
> save a lot. My machines have 2 quads with HT, so I still had a lot of idle 
> CPUs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5778) Turn on WAL compression by default

2012-04-12 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253058#comment-13253058
 ] 

Lars Hofhansl commented on HBASE-5778:
--

Yeah... The failures in TestHLog are because of this. Need to rollback or 
figure out what the problem is. Probably test related.

> Turn on WAL compression by default
> --
>
> Key: HBASE-5778
> URL: https://issues.apache.org/jira/browse/HBASE-5778
> Project: HBase
>  Issue Type: Improvement
>Reporter: Jean-Daniel Cryans
>Assignee: Jean-Daniel Cryans
>Priority: Blocker
> Fix For: 0.94.0, 0.96.0
>
> Attachments: HBASE-5778.patch
>
>
> I ran some tests to verify if WAL compression should be turned on by default.
> For a use case where it's not very useful (values two orders of magnitude 
> bigger than the keys), the insert time wasn't different and the CPU usage was 15% 
> higher (150% CPU usage VS 130% when not compressing the WAL).
> When values are smaller than the keys, I saw a 38% improvement for the insert 
> run time and CPU usage was 33% higher (600% CPU usage VS 450%). I'm not sure 
> WAL compression accounts for all the additional CPU usage, it might just be 
> that we're able to insert faster and we spend more time in the MemStore per 
> second (because our MemStores are bad when they contain tens of thousands of 
> values).
> Those are two extremes, but it shows that for the price of some CPU we can 
> save a lot. My machines have 2 quads with HT, so I still had a lot of idle 
> CPUs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5778) Turn on WAL compression by default

2012-04-12 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253053#comment-13253053
 ] 

Lars Hofhansl commented on HBASE-5778:
--

I see a bunch of suspicious test failures now:
{code}
java.lang.NegativeArraySizeException
at 
org.apache.hadoop.hbase.regionserver.wal.HLogKey.readFields(HLogKey.java:305)
{code}


> Turn on WAL compression by default
> --
>
> Key: HBASE-5778
> URL: https://issues.apache.org/jira/browse/HBASE-5778
> Project: HBase
>  Issue Type: Improvement
>Reporter: Jean-Daniel Cryans
>Assignee: Jean-Daniel Cryans
>Priority: Blocker
> Fix For: 0.94.0, 0.96.0
>
> Attachments: HBASE-5778.patch
>
>
> I ran some tests to verify if WAL compression should be turned on by default.
> For a use case where it's not very useful (values two orders of magnitude 
> bigger than the keys), the insert time wasn't different and the CPU usage was 15% 
> higher (150% CPU usage VS 130% when not compressing the WAL).
> When values are smaller than the keys, I saw a 38% improvement for the insert 
> run time and CPU usage was 33% higher (600% CPU usage VS 450%). I'm not sure 
> WAL compression accounts for all the additional CPU usage, it might just be 
> that we're able to insert faster and we spend more time in the MemStore per 
> second (because our MemStores are bad when they contain tens of thousands of 
> values).
> Those are two extremes, but it shows that for the price of some CPU we can 
> save a lot. My machines have 2 quads with HT, so I still had a lot of idle 
> CPUs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5677) The master never does balance because duplicate openhandled the one region

2012-04-12 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253046#comment-13253046
 ] 

Lars Hofhansl commented on HBASE-5677:
--

@xufeng: I made a trunk patch, so that I can get a HadoopQA test run.

All the failures are unrelated, though. In fact they are all because of 
negative array sizes in WALEdit de-serialization, which is suspicious.

> The master never does balance because duplicate openhandled the one region
> --
>
> Key: HBASE-5677
> URL: https://issues.apache.org/jira/browse/HBASE-5677
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.6
> Environment: 0.90
>Reporter: xufeng
>Assignee: xufeng
> Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
> Attachments: 5677-proposal.txt, 5677-proposal.txt, 5677-proposal.txt, 
> HBASE-5677-90-v1.patch, surefire-report_no_patched_v1.html, 
> surefire-report_patched_v1.html
>
>
> If a region gets assigned while the master is doing initialization (before 
> processFailover), the region will be open-handled twice,
> because the unassigned node in zookeeper will be handled again in 
> AssignmentManager#processFailover().
> This leaves the region in RIT, thus the master never does balance.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5677) The master never does balance because duplicate openhandled the one region

2012-04-12 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252953#comment-13252953
 ] 

Lars Hofhansl commented on HBASE-5677:
--

TestImportTsv passes as well.

> The master never does balance because duplicate openhandled the one region
> --
>
> Key: HBASE-5677
> URL: https://issues.apache.org/jira/browse/HBASE-5677
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.6
> Environment: 0.90
>Reporter: xufeng
>Assignee: xufeng
> Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
> Attachments: 5677-proposal.txt, 5677-proposal.txt, 
> HBASE-5677-90-v1.patch, surefire-report_no_patched_v1.html, 
> surefire-report_patched_v1.html
>
>
> If a region gets assigned while the master is doing initialization (before 
> processFailover), the region will be open-handled twice,
> because the unassigned node in zookeeper will be handled again in 
> AssignmentManager#processFailover().
> This leaves the region in RIT, thus the master never does balance.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5677) The master never does balance because duplicate openhandled the one region

2012-04-12 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252950#comment-13252950
 ] 

Lars Hofhansl commented on HBASE-5677:
--

I ran TestReplication with this patch (5677-proposal.txt) and it passed fine.

> The master never does balance because duplicate openhandled the one region
> --
>
> Key: HBASE-5677
> URL: https://issues.apache.org/jira/browse/HBASE-5677
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.6
> Environment: 0.90
>Reporter: xufeng
>Assignee: xufeng
> Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
> Attachments: 5677-proposal.txt, HBASE-5677-90-v1.patch, 
> surefire-report_no_patched_v1.html, surefire-report_patched_v1.html
>
>
> If a region gets assigned while the master is doing initialization (before 
> processFailover), the region will be open-handled twice,
> because the unassigned node in zookeeper will be handled again in 
> AssignmentManager#processFailover().
> This leaves the region in RIT, thus the master never does balance.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5778) Turn on WAL compression by default

2012-04-12 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252939#comment-13252939
 ] 

Lars Hofhansl commented on HBASE-5778:
--

+1 on patch

> Turn on WAL compression by default
> --
>
> Key: HBASE-5778
> URL: https://issues.apache.org/jira/browse/HBASE-5778
> Project: HBase
>  Issue Type: Improvement
>Reporter: Jean-Daniel Cryans
>Priority: Blocker
> Fix For: 0.94.0, 0.96.0
>
> Attachments: HBASE-5778.patch
>
>
> I ran some tests to verify if WAL compression should be turned on by default.
> For a use case where it's not very useful (values two orders of magnitude 
> bigger than the keys), the insert time wasn't different and the CPU usage was 15% 
> higher (150% CPU usage VS 130% when not compressing the WAL).
> When values are smaller than the keys, I saw a 38% improvement for the insert 
> run time and CPU usage was 33% higher (600% CPU usage VS 450%). I'm not sure 
> WAL compression accounts for all the additional CPU usage, it might just be 
> that we're able to insert faster and we spend more time in the MemStore per 
> second (because our MemStores are bad when they contain tens of thousands of 
> values).
> Those are two extremes, but it shows that for the price of some CPU we can 
> save a lot. My machines have 2 quads with HT, so I still had a lot of idle 
> CPUs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5604) M/R tool to replay WAL files

2012-04-12 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252934#comment-13252934
 ] 

Lars Hofhansl commented on HBASE-5604:
--

And I did insert the spaces before commit :)

> M/R tool to replay WAL files
> 
>
> Key: HBASE-5604
> URL: https://issues.apache.org/jira/browse/HBASE-5604
> Project: HBase
>  Issue Type: New Feature
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5604-v10.txt, 5604-v11.txt, 5604-v4.txt, 5604-v6.txt, 
> 5604-v7.txt, 5604-v8.txt, 5604-v9.txt, HLog-5604-v3.txt
>
>
> Just an idea I had. Might be useful for restore of a backup using the HLogs.
> This could be an M/R job (with a mapper per HLog file).
> The tool would get a timerange and a (set of) table(s). We'd pick the right 
> HLogs based on time before the M/R job is started and then have a mapper per 
> HLog file.
> The mapper would then go through the HLog, filter all WALEdits that didn't 
> fit into the time range or are not from any of the tables, and then use 
> HFileOutputFormat to generate HFiles.
> Would need to indicate the splits we want, probably from a live table.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5604) M/R tool to replay WAL files

2012-04-12 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252892#comment-13252892
 ] 

Lars Hofhansl commented on HBASE-5604:
--

@Ted: Are you OK with latest patch? If so, I'll commit today.

> M/R tool to replay WAL files
> 
>
> Key: HBASE-5604
> URL: https://issues.apache.org/jira/browse/HBASE-5604
> Project: HBase
>  Issue Type: New Feature
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5604-v10.txt, 5604-v11.txt, 5604-v4.txt, 5604-v6.txt, 
> 5604-v7.txt, 5604-v8.txt, 5604-v9.txt, HLog-5604-v3.txt
>
>
> Just an idea I had. Might be useful for restore of a backup using the HLogs.
> This could be an M/R job (with a mapper per HLog file).
> The tool would get a timerange and a (set of) table(s). We'd pick the right 
> HLogs based on time before the M/R job is started and then have a mapper per 
> HLog file.
> The mapper would then go through the HLog, filter all WALEdits that didn't 
> fit into the time range or are not from any of the tables, and then use 
> HFileOutputFormat to generate HFiles.
> Would need to indicate the splits we want, probably from a live table.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5775) ZKUtil doesn't handle deleteRecurisively cleanly

2012-04-12 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252889#comment-13252889
 ] 

Lars Hofhansl commented on HBASE-5775:
--

Thanks for the patch.

> ZKUtil doesn't handle deleteRecurisively cleanly
> 
>
> Key: HBASE-5775
> URL: https://issues.apache.org/jira/browse/HBASE-5775
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.94.0
>Reporter: Jesse Yates
>Assignee: Jesse Yates
> Fix For: 0.94.0, 0.96.0
>
> Attachments: java_HBASE-5775.patch
>
>
> ZKUtil.deleteNodeRecursively()'s contract says that it handles deletion of 
> the node and all its children. However, nothing is mentioned as to what 
> happens if the node you are attempting to delete doesn't actually exist. 
> Turns out, it throws a null pointer exception.
> I'm proposing that we change the code s.t. it handles the case where the 
> parent is already gone and exits cleanly, rather than failing horribly.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5775) ZKUtil doesn't handle deleteRecurisively cleanly

2012-04-12 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252831#comment-13252831
 ] 

Lars Hofhansl commented on HBASE-5775:
--

Will commit when HadoopQA is done.

> ZKUtil doesn't handle deleteRecurisively cleanly
> 
>
> Key: HBASE-5775
> URL: https://issues.apache.org/jira/browse/HBASE-5775
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.94.0
>Reporter: Jesse Yates
>Assignee: Jesse Yates
> Fix For: 0.94.0, 0.96.0
>
> Attachments: java_HBASE-5775.patch
>
>
> ZKUtil.deleteNodeRecursively()'s contract says that it handles deletion of 
> the node and all its children. However, nothing is mentioned as to what 
> happens if the node you are attempting to delete doesn't actually exist. 
> Turns out, it throws a null pointer exception.
> I'm proposing that we change the code s.t. it handles the case where the 
> parent is already gone and exits cleanly, rather than failing horribly.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5774) Add documentation for WALPlayer to HBase reference guide.

2012-04-12 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252755#comment-13252755
 ] 

Lars Hofhansl commented on HBASE-5774:
--

Thanks Doug. I think I'll just roll this up into parent.

> Add documentation for WALPlayer to HBase reference guide.
> -
>
> Key: HBASE-5774
> URL: https://issues.apache.org/jira/browse/HBASE-5774
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Attachments: 5774.txt
>
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3443) ICV optimization to look in memstore first and then store files (HBASE-3082) does not work when deletes are in the mix

2012-04-12 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252726#comment-13252726
 ] 

Lars Hofhansl commented on HBASE-3443:
--

@Stack: Wasn't sure about this one. Didn't feel right to add this to a 
"performance release" :)

You think this should go into 0.94? Might be good to have this correctness fix 
early.

> ICV optimization to look in memstore first and then store files (HBASE-3082) 
> does not work when deletes are in the mix
> --
>
> Key: HBASE-3443
> URL: https://issues.apache.org/jira/browse/HBASE-3443
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 0.90.0, 0.90.1, 0.90.2, 0.90.3, 0.90.4, 0.90.5, 0.90.6, 
> 0.92.0, 0.92.1
>Reporter: Kannan Muthukkaruppan
>Assignee: Lars Hofhansl
>Priority: Critical
>  Labels: corruption
> Fix For: 0.96.0
>
> Attachments: 3443.txt
>
>
> For incrementColumnValue() HBASE-3082 adds an optimization to check memstores 
> first, and only if not present in the memstore then check the store files. In 
> the presence of deletes, the above optimization is not reliable.
> If the column is marked as deleted in the memstore, one should not look 
> further into the store files. But currently, the code does so.
> Sample test code outline:
> {code}
> admin.createTable(desc)
> table = HTable.new(conf, tableName)
> table.incrementColumnValue(Bytes.toBytes("row"), cf1name, 
> Bytes.toBytes("column"), 5);
> admin.flush(tableName)
> sleep(2)
> del = Delete.new(Bytes.toBytes("row"))
> table.delete(del)
> table.incrementColumnValue(Bytes.toBytes("row"), cf1name, 
> Bytes.toBytes("column"), 5);
> get = Get.new(Bytes.toBytes("row"))
> keyValues = table.get(get).raw()
> keyValues.each do |keyValue|
>   puts "Expect 5; Got Value=#{Bytes.toLong(keyValue.getValue())}";
> end
> {code}
> The above prints:
> {code}
> Expect 5; Got Value=10
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5677) The master never does balance because duplicate openhandled the one region

2012-04-12 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252533#comment-13252533
 ] 

Lars Hofhansl commented on HBASE-5677:
--

@Stack: This seems like a good addition to 0.94.
Let's say: if we can track down the test failures today we'll put this in 
0.94.0, otherwise 0.94.1.

> The master never does balance because duplicate openhandled the one region
> --
>
> Key: HBASE-5677
> URL: https://issues.apache.org/jira/browse/HBASE-5677
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.6
> Environment: 0.90
>Reporter: xufeng
>Assignee: xufeng
> Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
> Attachments: 5677-proposal.txt, HBASE-5677-90-v1.patch, 
> surefire-report_no_patched_v1.html, surefire-report_patched_v1.html
>
>
> If a region gets assigned while the master is still initializing (before 
> processFailover runs), the region will be open-handled twice, because the 
> unassigned node in ZooKeeper is handled again in 
> AssignmentManager#processFailover().
> That leaves the region in RIT, so the master never balances.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5604) M/R tool to replay WAL files

2012-04-12 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252530#comment-13252530
 ] 

Lars Hofhansl commented on HBASE-5604:
--

Since this is only new code I propose this for 0.94.0 as well.

> M/R tool to replay WAL files
> 
>
> Key: HBASE-5604
> URL: https://issues.apache.org/jira/browse/HBASE-5604
> Project: HBase
>  Issue Type: New Feature
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Attachments: 5604-v4.txt, 5604-v6.txt, 5604-v7.txt, 5604-v8.txt, 
> 5604-v9.txt, HLog-5604-v3.txt
>
>
> Just an idea I had. Might be useful for restore of a backup using the HLogs.
> This could be an M/R job (with a mapper per HLog file).
> The tool would get a timerange and a (set of) table(s). We'd pick the right 
> HLogs based on time before the M/R job is started and then have a mapper per 
> HLog file.
> The mapper would then go through the HLog, filter all WALEdits that didn't 
> fit into the time range or are not any of the tables and then uses 
> HFileOutputFormat to generate HFiles.
> Would need to indicate the splits we want, probably from a live table.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5677) The master never does balance because duplicate openhandled the one region

2012-04-12 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252516#comment-13252516
 ] 

Lars Hofhansl commented on HBASE-5677:
--

@xufeng: Are you saying your change is good to go in?


> The master never does balance because duplicate openhandled the one region
> --
>
> Key: HBASE-5677
> URL: https://issues.apache.org/jira/browse/HBASE-5677
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.6
> Environment: 0.90
>Reporter: xufeng
>Assignee: xufeng
> Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
> Attachments: 5677-proposal.txt, HBASE-5677-90-v1.patch, 
> surefire-report_no_patched_v1.html, surefire-report_patched_v1.html
>
>
> If a region gets assigned while the master is still initializing (before 
> processFailover runs), the region will be open-handled twice, because the 
> unassigned node in ZooKeeper is handled again in 
> AssignmentManager#processFailover().
> That leaves the region in RIT, so the master never balances.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5741) ImportTsv does not check for table existence

2012-04-11 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252095#comment-13252095
 ] 

Lars Hofhansl commented on HBASE-5741:
--

Even the created HFileOutputFormat uses the table in HBase to find out about 
the splits. We will rarely bulk import into an empty table that is not 
pre-split...? (I'm assuming here; you folks at Cloudera see many more use cases 
than I do.)

If you believe strongly that we should add this, let's do it :)
We can do Import in a separate patch, or not do that, as we see fit.
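
For concreteness, a minimal sketch of such a check (HBaseAdmin.tableExists() is 
the assumed entry point; this is not the attached patch):
{code}
// Sketch only: fail fast in createSubmittableJob when the target table is
// missing, instead of failing later during the META prefetch.
HBaseAdmin admin = new HBaseAdmin(conf);
if (!admin.tableExists(tableName)) {
  throw new TableNotFoundException("Table " + tableName +
      " must exist before running importtsv");
}
{code}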


> ImportTsv does not check for table existence 
> -
>
> Key: HBASE-5741
> URL: https://issues.apache.org/jira/browse/HBASE-5741
> Project: HBase
>  Issue Type: Bug
>  Components: mapreduce
>Affects Versions: 0.90.4
>Reporter: Clint Heath
>Assignee: Himanshu Vashishtha
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5741-94.txt, 5741-v3.txt, HBase-5741-v2.patch, 
> HBase-5741.patch
>
>
> The usage statement for the "importtsv" command to hbase claims this:
> "Note: if you do not use this option, then the target table must already 
> exist in HBase" (in reference to the "importtsv.bulk.output" command-line 
> option)
> The truth is, the table must exist no matter what, importtsv cannot and will 
> not create it for you.
> This is the case because the createSubmittableJob method of ImportTsv does 
> not even attempt to check if the table exists already, much less create it:
> (From org.apache.hadoop.hbase.mapreduce.ImportTsv.java)
> 305 HTable table = new HTable(conf, tableName);
> The HTable method signature in use there assumes the table exists and runs a 
> meta scan on it:
> (From org.apache.hadoop.hbase.client.HTable.java)
> 142 * Creates an object to access a HBase table.
> ...
> 151 public HTable(Configuration conf, final String tableName)
> What we should do inside of createSubmittableJob is something similar to what 
> the "completebulkloads" command would do:
> (Taken from org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.java)
> 690 boolean tableExists = this.doesTableExist(tableName);
> 691 if (!tableExists) this.createTable(tableName,dirPath);
> Currently the docs are misleading, the table in fact must exist prior to 
> running importtsv. We should check if it exists rather than assume it's 
> already there and throw the below exception:
> 12/03/14 17:15:42 WARN client.HConnectionManager$HConnectionImplementation: 
> Encountered problems when prefetch META table: 
> org.apache.hadoop.hbase.TableNotFoundException: Cannot find row in .META. for 
> table: myTable2, row=myTable2,,99
>   at 
> org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:150)
> ...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5741) ImportTsv does not check for table existence

2012-04-11 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252076#comment-13252076
 ] 

Lars Hofhansl commented on HBASE-5741:
--

Import does not do that either. At least we should be consistent between the 
various importing tools.
It seems overkill to burden these tools with that.

That said, I am not opposed, just questioning the usefulness.

> ImportTsv does not check for table existence 
> -
>
> Key: HBASE-5741
> URL: https://issues.apache.org/jira/browse/HBASE-5741
> Project: HBase
>  Issue Type: Bug
>  Components: mapreduce
>Affects Versions: 0.90.4
>Reporter: Clint Heath
>Assignee: Himanshu Vashishtha
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5741-94.txt, 5741-v3.txt, HBase-5741-v2.patch, 
> HBase-5741.patch
>
>
> The usage statement for the "importtsv" command to hbase claims this:
> "Note: if you do not use this option, then the target table must already 
> exist in HBase" (in reference to the "importtsv.bulk.output" command-line 
> option)
> The truth is, the table must exist no matter what, importtsv cannot and will 
> not create it for you.
> This is the case because the createSubmittableJob method of ImportTsv does 
> not even attempt to check if the table exists already, much less create it:
> (From org.apache.hadoop.hbase.mapreduce.ImportTsv.java)
> 305 HTable table = new HTable(conf, tableName);
> The HTable method signature in use there assumes the table exists and runs a 
> meta scan on it:
> (From org.apache.hadoop.hbase.client.HTable.java)
> 142 * Creates an object to access a HBase table.
> ...
> 151 public HTable(Configuration conf, final String tableName)
> What we should do inside of createSubmittableJob is something similar to what 
> the "completebulkloads" command would do:
> (Taken from org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.java)
> 690 boolean tableExists = this.doesTableExist(tableName);
> 691 if (!tableExists) this.createTable(tableName,dirPath);
> Currently the docs are misleading, the table in fact must exist prior to 
> running importtsv. We should check if it exists rather than assume it's 
> already there and throw the below exception:
> 12/03/14 17:15:42 WARN client.HConnectionManager$HConnectionImplementation: 
> Encountered problems when prefetch META table: 
> org.apache.hadoop.hbase.TableNotFoundException: Cannot find row in .META. for 
> table: myTable2, row=myTable2,,99
>   at 
> org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:150)
> ...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5741) ImportTsv does not check for table existence

2012-04-11 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252064#comment-13252064
 ] 

Lars Hofhansl commented on HBASE-5741:
--

Sorry for chiming in late here, but do we actually want ImportTsv to create the 
table for us?
Personally I'd rather have it fail, which forces me to think about how I want 
to create the table (pre-split, compression, etc.).
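
For illustration only, a sketch of the kind of explicit, up-front table 
creation I mean (table name, family, compression and split points are all made 
up):
{code}
// Hypothetical example: create the target table deliberately, with pre-splits
// and compression chosen up front, before running the import.
HTableDescriptor desc = new HTableDescriptor("myTable");
HColumnDescriptor cf = new HColumnDescriptor("d");
cf.setCompressionType(Compression.Algorithm.GZ);
desc.addFamily(cf);
byte[][] splits = { Bytes.toBytes("h"), Bytes.toBytes("p") };
new HBaseAdmin(conf).createTable(desc, splits);
{code}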


> ImportTsv does not check for table existence 
> -
>
> Key: HBASE-5741
> URL: https://issues.apache.org/jira/browse/HBASE-5741
> Project: HBase
>  Issue Type: Bug
>  Components: mapreduce
>Affects Versions: 0.90.4
>Reporter: Clint Heath
>Assignee: Himanshu Vashishtha
> Fix For: 0.94.0, 0.96.0
>
> Attachments: 5741-94.txt, 5741-v3.txt, HBase-5741-v2.patch, 
> HBase-5741.patch
>
>
> The usage statement for the "importtsv" command to hbase claims this:
> "Note: if you do not use this option, then the target table must already 
> exist in HBase" (in reference to the "importtsv.bulk.output" command-line 
> option)
> The truth is, the table must exist no matter what, importtsv cannot and will 
> not create it for you.
> This is the case because the createSubmittableJob method of ImportTsv does 
> not even attempt to check if the table exists already, much less create it:
> (From org.apache.hadoop.hbase.mapreduce.ImportTsv.java)
> 305 HTable table = new HTable(conf, tableName);
> The HTable method signature in use there assumes the table exists and runs a 
> meta scan on it:
> (From org.apache.hadoop.hbase.client.HTable.java)
> 142 * Creates an object to access a HBase table.
> ...
> 151 public HTable(Configuration conf, final String tableName)
> What we should do inside of createSubmittableJob is something similar to what 
> the "completebulkloads" command would do:
> (Taken from org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.java)
> 690 boolean tableExists = this.doesTableExist(tableName);
> 691 if (!tableExists) this.createTable(tableName,dirPath);
> Currently the docs are misleading, the table in fact must exist prior to 
> running importtsv. We should check if it exists rather than assume it's 
> already there and throw the below exception:
> 12/03/14 17:15:42 WARN client.HConnectionManager$HConnectionImplementation: 
> Encountered problems when prefetch META table: 
> org.apache.hadoop.hbase.TableNotFoundException: Cannot find row in .META. for 
> table: myTable2, row=myTable2,,99
>   at 
> org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:150)
> ...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5677) The master never does balance because duplicate openhandled the one region

2012-04-11 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252057#comment-13252057
 ] 

Lars Hofhansl commented on HBASE-5677:
--

Sigh... I think these are test-related issues. In a real cluster the Master is 
useless unless initialized (although I wouldn't know if that holds for a backup 
master as well).
I'll have a look at these when I get a chance.

> The master never does balance because duplicate openhandled the one region
> --
>
> Key: HBASE-5677
> URL: https://issues.apache.org/jira/browse/HBASE-5677
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.6
> Environment: 0.90
>Reporter: xufeng
>Assignee: xufeng
> Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
> Attachments: 5677-proposal.txt, HBASE-5677-90-v1.patch, 
> surefire-report_no_patched_v1.html, surefire-report_patched_v1.html
>
>
> If a region gets assigned while the master is still initializing (before 
> processFailover runs), the region will be open-handled twice, because the 
> unassigned node in ZooKeeper is handled again in 
> AssignmentManager#processFailover().
> That leaves the region in RIT, so the master never balances.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3443) ICV optimization to look in memstore first and then store files (HBASE-3082) does not work when deletes are in the mix

2012-04-11 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252050#comment-13252050
 ] 

Lars Hofhansl commented on HBASE-3443:
--

I also verified that the test included in the patch fails without the other 
changes.

> ICV optimization to look in memstore first and then store files (HBASE-3082) 
> does not work when deletes are in the mix
> --
>
> Key: HBASE-3443
> URL: https://issues.apache.org/jira/browse/HBASE-3443
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 0.90.0, 0.90.1, 0.90.2, 0.90.3, 0.90.4, 0.90.5, 0.90.6, 
> 0.92.0, 0.92.1
>Reporter: Kannan Muthukkaruppan
>Priority: Critical
>  Labels: corruption
> Attachments: 3443.txt
>
>
> For incrementColumnValue() HBASE-3082 adds an optimization to check memstores 
> first, and only if not present in the memstore then check the store files. In 
> the presence of deletes, the above optimization is not reliable.
> If the column is marked as deleted in the memstore, one should not look 
> further into the store files. But currently, the code does so.
> Sample test code outline:
> {code}
> admin.createTable(desc)
> table = HTable.new(conf, tableName)
> table.incrementColumnValue(Bytes.toBytes("row"), cf1name, 
> Bytes.toBytes("column"), 5);
> admin.flush(tableName)
> sleep(2)
> del = Delete.new(Bytes.toBytes("row"))
> table.delete(del)
> table.incrementColumnValue(Bytes.toBytes("row"), cf1name, 
> Bytes.toBytes("column"), 5);
> get = Get.new(Bytes.toBytes("row"))
> keyValues = table.get(get).raw()
> keyValues.each do |keyValue|
>   puts "Expect 5; Got Value=#{Bytes.toLong(keyValue.getValue())}";
> end
> {code}
> The above prints:
> {code}
> Expect 5; Got Value=10
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5677) The master never does balance because duplicate openhandled the one region

2012-04-11 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13251880#comment-13251880
 ] 

Lars Hofhansl commented on HBASE-5677:
--

@xufeng: OK, thanks. This is caused by the client?

Back to my previous comment: I think a good definition for declaring the master 
"running" is when it is fully initialized. I'd still just add this extra check 
to isMasterRunning. Is anything failing with that? If so, I'd like to 
understand why.

BTW, this is the only issue holding up the next RC for the first 0.94 release.
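
A minimal sketch of the extra check I have in mind (the committed patch may end 
up looking different):
{code}
// Master-side sketch: only report "running" once initialization has completed,
// so a half-initialized master is no longer treated as available.
public boolean isMasterRunning() {
  return !isStopped() && isInitialized();
}
{code}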

> The master never does balance because duplicate openhandled the one region
> --
>
> Key: HBASE-5677
> URL: https://issues.apache.org/jira/browse/HBASE-5677
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.6
> Environment: 0.90
>Reporter: xufeng
>Assignee: xufeng
> Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
> Attachments: HBASE-5677-90-v1.patch, 
> surefire-report_no_patched_v1.html, surefire-report_patched_v1.html
>
>
> If a region gets assigned while the master is still initializing (before 
> processFailover runs), the region will be open-handled twice, because the 
> unassigned node in ZooKeeper is handled again in 
> AssignmentManager#processFailover().
> That leaves the region in RIT, so the master never balances.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5755) Region sever looking for master forever with cached stale data.

2012-04-10 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13251327#comment-13251327
 ] 

Lars Hofhansl commented on HBASE-5755:
--

Looks like MasterAddressTracker.java was moved from 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase to 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/zookeeper in 0.96

> Region sever looking for master forever with cached stale data.
> ---
>
> Key: HBASE-5755
> URL: https://issues.apache.org/jira/browse/HBASE-5755
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.96.0
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
> Fix For: 0.96.0
>
> Attachments: hbase-5755.patch
>
>
> When the master address tracker doesn't have the master address ZK data, or 
> the cached data is wrong, region server should not use the cached data.
> It should pull the data from ZK directly again.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5756) we can change defalult File Appender to RFA instead of DRFA.

2012-04-10 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13251315#comment-13251315
 ] 

Lars Hofhansl commented on HBASE-5756:
--

Anybody can still change log4j.properties to get the desired behavior. I see no 
need to put this into 0.94.

That said, the change in HBASE-5655 is simple; if you believe strongly, I'll 
put it into 0.94.
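
For reference, a sketch of the log4j.properties change being discussed 
(property names follow the stock HBase log4j file; the size and backup-count 
values are just examples):
{code}
# Switch the default file appender from DRFA to a size-bounded RFA.
hbase.root.logger=INFO,RFA
log4j.appender.RFA=org.apache.log4j.RollingFileAppender
log4j.appender.RFA.File=${hbase.log.dir}/${hbase.log.file}
log4j.appender.RFA.MaxFileSize=256MB
log4j.appender.RFA.MaxBackupIndex=20
log4j.appender.RFA.layout=org.apache.log4j.PatternLayout
log4j.appender.RFA.layout.ConversionPattern=%d{ISO8601} %-5p [%t] %c{2}: %m%n
{code}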


> we can change defalult File Appender to RFA instead of DRFA.
> 
>
> Key: HBASE-5756
> URL: https://issues.apache.org/jira/browse/HBASE-5756
> Project: HBase
>  Issue Type: Bug
>Reporter: rohithsharma
>Priority: Minor
>
> This can be a point of concern when logging is much heavier on a certain day 
> because of increased activity. In that case the log file for that day can 
> grow huge, and such large logs cannot easily be opened for analysis.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5677) The master never does balance because duplicate openhandled the one region

2012-04-10 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13251189#comment-13251189
 ] 

Lars Hofhansl commented on HBASE-5677:
--

bq. public boolean isMasterRunning() { return !isStopped() && isInitialized(); 
} But Some class like HMerge,this tool just care the master is running or not.

Is that actually a problem? HMerge would need to wait until the master is 
initialized. That seems generally a better condition for "running" than just 
having the process up.
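
Illustrative sketch only (assuming HMasterInterface as the client-side handle): 
with that definition, a tool like HMerge would simply poll until the master 
reports itself as running.
{code}
// With "running" implying "initialized", callers wait until the check passes.
void waitForMaster(HMasterInterface master) throws InterruptedException {
  while (!master.isMasterRunning()) {
    Thread.sleep(1000);   // back off and re-check
  }
}
{code}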

> The master never does balance because duplicate openhandled the one region
> --
>
> Key: HBASE-5677
> URL: https://issues.apache.org/jira/browse/HBASE-5677
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.6
> Environment: 0.90
>Reporter: xufeng
>Assignee: xufeng
> Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
> Attachments: HBASE-5677-90-v1.patch, 
> surefire-report_no_patched_v1.html, surefire-report_patched_v1.html
>
>
> If a region gets assigned while the master is still initializing (before 
> processFailover runs), the region will be open-handled twice, because the 
> unassigned node in ZooKeeper is handled again in 
> AssignmentManager#processFailover().
> That leaves the region in RIT, so the master never balances.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5604) M/R tools to replay WAL files

2012-04-10 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13251053#comment-13251053
 ] 

Lars Hofhansl commented on HBASE-5604:
--

@Stack: Yeah, probably. You'd vote for leaving it the way it is?

> M/R tools to replay WAL files
> -
>
> Key: HBASE-5604
> URL: https://issues.apache.org/jira/browse/HBASE-5604
> Project: HBase
>  Issue Type: New Feature
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Attachments: 5604-v4.txt, 5604-v6.txt, 5604-v7.txt, 5604-v8.txt, 
> HLog-5604-v3.txt
>
>
> Just an idea I had. Might be useful for restore of a backup using the HLogs.
> This could be an M/R job (with a mapper per HLog file).
> The tool would get a timerange and a (set of) table(s). We'd pick the right 
> HLogs based on time before the M/R job is started and then have a mapper per 
> HLog file.
> The mapper would then go through the HLog, filter all WALEdits that didn't 
> fit into the time range or are not any of the tables and then uses 
> HFileOutputFormat to generate HFiles.
> Would need to indicate the splits we want, probably from a live table.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5677) The master never does balance because duplicate openhandled the one region

2012-04-10 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13251006#comment-13251006
 ] 

Lars Hofhansl commented on HBASE-5677:
--

Is this at all similar to HBASE-5615?
Maybe here too, there is no point holding up 0.94.0 for this.

> The master never does balance because duplicate openhandled the one region
> --
>
> Key: HBASE-5677
> URL: https://issues.apache.org/jira/browse/HBASE-5677
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.6
> Environment: 0.90
>Reporter: xufeng
>Assignee: xufeng
> Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
> Attachments: HBASE-5677-90-v1.patch, 
> surefire-report_no_patched_v1.html, surefire-report_patched_v1.html
>
>
> If a region gets assigned while the master is still initializing (before 
> processFailover runs), the region will be open-handled twice, because the 
> unassigned node in ZooKeeper is handled again in 
> AssignmentManager#processFailover().
> That leaves the region in RIT, so the master never balances.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5604) M/R tools to replay WAL files

2012-04-10 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13251002#comment-13251002
 ] 

Lars Hofhansl commented on HBASE-5604:
--

The parsing would be done with SimpleDateFormat. I went back and forth on this 
a bit. We could accept both, actually: try SimpleDateFormat first and, if 
there's a parse error, interpret the value as milliseconds.
Not sure which part would do that parsing. It can be done in WALPlayer, or down 
at HLogInputFormat.
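
Something like this sketch is what I have in mind (the date pattern is just an 
example, not necessarily what the tool will ship with):
{code}
// Sketch only: try a human-readable timestamp first and fall back to
// interpreting the argument as milliseconds since the epoch.
private static long parseTimeOption(String value) {
  try {
    return new java.text.SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss")
        .parse(value).getTime();
  } catch (java.text.ParseException e) {
    return Long.parseLong(value);   // not a formatted date, assume ms
  }
}
{code}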

> M/R tools to replay WAL files
> -
>
> Key: HBASE-5604
> URL: https://issues.apache.org/jira/browse/HBASE-5604
> Project: HBase
>  Issue Type: New Feature
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Attachments: 5604-v4.txt, 5604-v6.txt, 5604-v7.txt, 5604-v8.txt, 
> HLog-5604-v3.txt
>
>
> Just an idea I had. Might be useful for restore of a backup using the HLogs.
> This could be an M/R job (with a mapper per HLog file).
> The tool would get a timerange and a (set of) table(s). We'd pick the right 
> HLogs based on time before the M/R job is started and then have a mapper per 
> HLog file.
> The mapper would then go through the HLog, filter all WALEdits that didn't 
> fit into the time range or are not any of the tables and then uses 
> HFileOutputFormat to generate HFiles.
> Would need to indicate the splits we want, probably from a live table.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5728) Methods Missing in HTableInterface

2012-04-10 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250985#comment-13250985
 ] 

Lars Hofhansl commented on HBASE-5728:
--

{code}
public int getScannerCaching();
public void setScannerCaching(int scannerCaching);
{code}
These two are deprecated (in trunk, at least); let's not add them to the 
interface.
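
For callers that need to tune caching, the per-Scan setter is the 
non-deprecated route. A sketch (htable here is assumed to be an existing 
HTable/HTableInterface instance; worth double checking against trunk):
{code}
// Set the caching hint per scan instead of per table.
Scan scan = new Scan();
scan.setCaching(100);                   // rows fetched per RPC for this scan
ResultScanner scanner = htable.getScanner(scan);
{code}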

> Methods Missing in HTableInterface
> --
>
> Key: HBASE-5728
> URL: https://issues.apache.org/jira/browse/HBASE-5728
> Project: HBase
>  Issue Type: Improvement
>  Components: client
>Reporter: Bing Li
>
> Dear all,
> I found that some methods existing in HTable are not in HTableInterface:
>setAutoFlush
>setWriteBufferSize
>...
> In most cases, I manipulate HBase through HTableInterface from HTablePool. If 
> I need to use the above methods, how can I do that?
> I am considering writing my own table pool if there is no proper way. Is that 
> fine?
> Thanks so much!
> Best regards,
> Bing

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3443) ICV optimization to look in memstore first and then store files (HBASE-3082) does not work when deletes are in the mix

2012-04-10 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250968#comment-13250968
 ] 

Lars Hofhansl commented on HBASE-3443:
--

For 0.96, though.
I can take this. Can't promise to work on it today, though.
I had checked out the code a while ago to make ICV correct w.r.t. WAL updates; 
this mostly involves removing some crufty code.


> ICV optimization to look in memstore first and then store files (HBASE-3082) 
> does not work when deletes are in the mix
> --
>
> Key: HBASE-3443
> URL: https://issues.apache.org/jira/browse/HBASE-3443
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 0.90.0, 0.90.1, 0.90.2, 0.90.3, 0.90.4, 0.90.5, 0.90.6, 
> 0.92.0, 0.92.1
>Reporter: Kannan Muthukkaruppan
>Priority: Critical
>  Labels: corruption
>
> For incrementColumnValue() HBASE-3082 adds an optimization to check memstores 
> first, and only if not present in the memstore then check the store files. In 
> the presence of deletes, the above optimization is not reliable.
> If the column is marked as deleted in the memstore, one should not look 
> further into the store files. But currently, the code does so.
> Sample test code outline:
> {code}
> admin.createTable(desc)
> table = HTable.new(conf, tableName)
> table.incrementColumnValue(Bytes.toBytes("row"), cf1name, 
> Bytes.toBytes("column"), 5);
> admin.flush(tableName)
> sleep(2)
> del = Delete.new(Bytes.toBytes("row"))
> table.delete(del)
> table.incrementColumnValue(Bytes.toBytes("row"), cf1name, 
> Bytes.toBytes("column"), 5);
> get = Get.new(Bytes.toBytes("row"))
> keyValues = table.get(get).raw()
> keyValues.each do |keyValue|
>   puts "Expect 5; Got Value=#{Bytes.toLong(keyValue.getValue())}";
> end
> {code}
> The above prints:
> {code}
> Expect 5; Got Value=10
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5604) M/R tools to replay WAL files

2012-04-10 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250959#comment-13250959
 ] 

Lars Hofhansl commented on HBASE-5604:
--

* Let's not overkill on the exceptions. This is an inner class of WALPlayer, and 
WALPlayer will always pass the correct arguments. Maybe I'll make the mapper 
classes private and remove all the exception handling.
* I thought about using SimpleDateFormat, then punted. I guess you're right; we 
should make it a bit more user friendly.
* The TODO was left over. Not sure that even makes sense.


> M/R tools to replay WAL files
> -
>
> Key: HBASE-5604
> URL: https://issues.apache.org/jira/browse/HBASE-5604
> Project: HBase
>  Issue Type: New Feature
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Attachments: 5604-v4.txt, 5604-v6.txt, 5604-v7.txt, 5604-v8.txt, 
> HLog-5604-v3.txt
>
>
> Just an idea I had. Might be useful for restore of a backup using the HLogs.
> This could be an M/R job (with a mapper per HLog file).
> The tool would get a timerange and a (set of) table(s). We'd pick the right 
> HLogs based on time before the M/R job is started and then have a mapper per 
> HLog file.
> The mapper would then go through the HLog, filter all WALEdits that didn't 
> fit into the time range or are not any of the tables and then uses 
> HFileOutputFormat to generate HFiles.
> Would need to indicate the splits we want, probably from a live table.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5677) The master never does balance because duplicate openhandled the one region

2012-04-09 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250445#comment-13250445
 ] 

Lars Hofhansl commented on HBASE-5677:
--

Seems like this should go into 0.94.0. Agreed?

> The master never does balance because duplicate openhandled the one region
> --
>
> Key: HBASE-5677
> URL: https://issues.apache.org/jira/browse/HBASE-5677
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.6
> Environment: 0.90
>Reporter: xufeng
>Assignee: xufeng
> Attachments: HBASE-5677-90-v1.patch, 
> surefire-report_no_patched_v1.html, surefire-report_patched_v1.html
>
>
> If a region gets assigned while the master is still initializing (before 
> processFailover runs), the region will be open-handled twice, because the 
> unassigned node in ZooKeeper is handled again in 
> AssignmentManager#processFailover().
> That leaves the region in RIT, so the master never balances.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5604) M/R tools to replay WAL files

2012-04-09 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250439#comment-13250439
 ] 

Lars Hofhansl commented on HBASE-5604:
--

Thanks Ted.

* I'll add Javadoc.
* An index of 0 is used here, since when creating HFiles for bulk import only a 
single table is currently allowed (that is also documented in usage(), but 
perhaps not clearly enough...?). I'll add a comment to that effect.
* That the two arrays are of the same size is guaranteed in 
WALPlayer.createSubmittableJob(), but perhaps it is better to double check here 
(see the sketch after this list).
* Checking heapSize() seems unnecessary. After all, this is a single WALEdit.
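
A minimal sketch of that double check (variable and method names are 
assumptions, not taken from the attached patch):
{code}
// Defensive check: the source-table and target-table arrays handed to the
// mapper must line up one-to-one.
static void checkTableMapping(String[] tables, String[] tableMap)
    throws java.io.IOException {
  if (tables.length != tableMap.length) {
    throw new java.io.IOException("Expected the same number of tables and " +
        "mappings, got " + tables.length + " vs. " + tableMap.length);
  }
}
{code}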


> M/R tools to replay WAL files
> -
>
> Key: HBASE-5604
> URL: https://issues.apache.org/jira/browse/HBASE-5604
> Project: HBase
>  Issue Type: New Feature
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Attachments: 5604-v4.txt, 5604-v6.txt, 5604-v7.txt, HLog-5604-v3.txt
>
>
> Just an idea I had. Might be useful for restore of a backup using the HLogs.
> This could be an M/R job (with a mapper per HLog file).
> The tool would get a timerange and a (set of) table(s). We'd pick the right 
> HLogs based on time before the M/R job is started and then have a mapper per 
> HLog file.
> The mapper would then go through the HLog, filter all WALEdits that didn't 
> fit into the time range or are not any of the tables and then uses 
> HFileOutputFormat to generate HFiles.
> Would need to indicate the splits we want, probably from a live table.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5604) HLog replay tool that generates HFiles for use by LoadIncrementalHFiles.

2012-04-09 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250336#comment-13250336
 ] 

Lars Hofhansl commented on HBASE-5604:
--

TestForceCacheImportantBlocks passes locally. I'm ready to commit.

> HLog replay tool that generates HFiles for use by LoadIncrementalHFiles.
> 
>
> Key: HBASE-5604
> URL: https://issues.apache.org/jira/browse/HBASE-5604
> Project: HBase
>  Issue Type: New Feature
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Attachments: 5604-v4.txt, 5604-v6.txt, 5604-v7.txt, HLog-5604-v3.txt
>
>
> Just an idea I had. Might be useful for restore of a backup using the HLogs.
> This could be an M/R job (with a mapper per HLog file).
> The tool would get a timerange and a (set of) table(s). We'd pick the right 
> HLogs based on time before the M/R job is started and then have a mapper per 
> HLog file.
> The mapper would then go through the HLog, filter all WALEdits that didn't 
> fit into the time range or are not any of the tables and then uses 
> HFileOutputFormat to generate HFiles.
> Would need to indicate the splits we want, probably from a live table.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5615) the master never does balance because of balancing the parent region

2012-04-09 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250069#comment-13250069
 ] 

Lars Hofhansl commented on HBASE-5615:
--

Thanks Ram... I'll hold up 0.94 for this (unless you think that's unnecessary).

> the master never does balance because of balancing the parent region
> 
>
> Key: HBASE-5615
> URL: https://issues.apache.org/jira/browse/HBASE-5615
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.7
>Reporter: xufeng
>Assignee: xufeng
>Priority: Critical
> Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
> Attachments: 5615-trunk.txt, HBASE-5615-90.patch, HBASE-5615.patch, 
> NoPatched-surefire-report-5615-90.html, Patched_surefire-report-5615-90.html
>
>
> The master never balances because, when the master does rebuildUserRegions(), 
> it adds the parent region into AssignmentManager#servers.
> If the balancer lets the parent region move, the parent will stay in RIT 
> forever, so balancing will never be executed.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



