[jira] [Commented] (HBASE-5618) SplitLogManager - prevent unnecessary attempts to resubmits
[ https://issues.apache.org/jira/browse/HBASE-5618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13246874#comment-13246874 ]

Prakash Khemani commented on HBASE-5618:

TestColumnSeeking isn't failing for me.

SplitLogManager - prevent unnecessary attempts to resubmits
Key: HBASE-5618
URL: https://issues.apache.org/jira/browse/HBASE-5618
Project: HBase
Issue Type: Improvement
Components: wal, zookeeper
Reporter: Prakash Khemani
Assignee: Prakash Khemani
Attachments: 0001-HBASE-5618-SplitLogManager-prevent-unnecessary-attem.patch, 0001-HBASE-5618-SplitLogManager-prevent-unnecessary-attem.patch, 0001-HBASE-5618-SplitLogManager-prevent-unnecessary-attem.patch

Currently, once a watch fires indicating that the task node has been updated (heartbeated) by the worker, the SplitLogManager still takes quite some time before it updates the last-heard-from time. This is because the manager currently schedules another getDataSetWatch(), and only after that finishes will it update the task's last-heard-from time. This leads to a large number of ZK BadVersion warnings when resubmission is continuously attempted and keeps failing. Two changes should be made: (1) on a resubmission failure caused by BadVersion, the task's lastUpdate time should be bumped; (2) the task's lastUpdate time should be bumped as soon as the nodeDataChanged() watch fires, without waiting for getDataSetWatch() to complete.
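A minimal sketch of the two proposed changes, in simplified form - the Task class, field names, and callback wiring below are assumptions for illustration, not SplitLogManager's real internals:

{code:java}
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch of the two HBASE-5618 changes: bump lastUpdate (1) when
// a resubmit fails with BadVersion, and (2) as soon as the nodeDataChanged()
// watch fires, before the follow-up getDataSetWatch() completes.
public class SplitTaskHeartbeats {
  static final class Task {
    volatile long lastUpdate; // last time we heard from the worker
  }

  private final ConcurrentMap<String, Task> tasks = new ConcurrentHashMap<>();

  // (2) The watch fired: the worker touched the znode, so treat that as a
  // heartbeat immediately instead of waiting for the getData round trip.
  void nodeDataChanged(String path) {
    Task task = tasks.get(path);
    if (task != null) {
      task.lastUpdate = System.currentTimeMillis();
    }
    getDataSetWatch(path); // still refetch the data and re-arm the watch
  }

  // (1) The resubmit's versioned setData lost the race: some worker updated
  // the znode concurrently, which is itself evidence the worker is alive.
  void resubmitFailedWithBadVersion(String path) {
    Task task = tasks.get(path);
    if (task != null) {
      task.lastUpdate = System.currentTimeMillis();
    }
  }

  private void getDataSetWatch(String path) {
    // async zk.getData(...) elided
  }
}
{code}

With (1) and (2) in place, the timeout monitor sees a fresh lastUpdate right after each heartbeat, so it stops re-attempting resubmits that can only fail with BadVersion.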
[jira] [Commented] (HBASE-5618) SplitLogManager - prevent unnecessary attempts to resubmits
[ https://issues.apache.org/jira/browse/HBASE-5618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245790#comment-13245790 ]

Prakash Khemani commented on HBASE-5618:

Sorry for the test failure. Fixed and verified.
[jira] [Commented] (HBASE-5606) SplitLogManger async delete node hangs log splitting when ZK connection is lost
[ https://issues.apache.org/jira/browse/HBASE-5606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245797#comment-13245797 ]

Prakash Khemani commented on HBASE-5606:

Making the deletes synchronous doesn't theoretically remove the race condition. A master could send the delete to the ZK server it is connected to and then die. The next master can (theoretically) still run into the pending-delete race.

SplitLogManger async delete node hangs log splitting when ZK connection is lost
Key: HBASE-5606
URL: https://issues.apache.org/jira/browse/HBASE-5606
Project: HBase
Issue Type: Bug
Components: wal
Affects Versions: 0.92.0
Reporter: Gopinathan A
Assignee: Prakash Khemani
Priority: Critical
Fix For: 0.92.2
Attachments: 0001-HBASE-5606-SplitLogManger-async-delete-node-hangs-lo.patch, 0001-HBASE-5606-SplitLogManger-async-delete-node-hangs-lo.patch

1. One RS died; the ServerShutdownHandler found it out and started the distributed log splitting.
2. All tasks failed because the ZK connection was lost, so all the tasks were deleted asynchronously.
3. ServerShutdownHandler retried the log splitting.
4. The asynchronous deletion from step 2 finally happened, but hit the new task.
5. This left the SplitLogManager in a hanging state, so the .META. region was not assigned for a long time.

{noformat}
hbase-root-master-HOST-192-168-47-204.log.2012-03-14(55413,79):2012-03-14 19:28:47,932 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: put up splitlog task at znode /hbase/splitlog/hdfs%3A%2F%2F192.168.47.205%3A9000%2Fhbase%2F.logs%2Flinux-114.site%2C60020%2C1331720381665-splitting%2Flinux-114.site%252C60020%252C1331720381665.1331752316170
hbase-root-master-HOST-192-168-47-204.log.2012-03-14(89303,79):2012-03-14 19:34:32,387 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: put up splitlog task at znode /hbase/splitlog/hdfs%3A%2F%2F192.168.47.205%3A9000%2Fhbase%2F.logs%2Flinux-114.site%2C60020%2C1331720381665-splitting%2Flinux-114.site%252C60020%252C1331720381665.1331752316170
{noformat}

{noformat}
hbase-root-master-HOST-192-168-47-204.log.2012-03-14(80417,99):2012-03-14 19:34:31,196 DEBUG org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback: deleted /hbase/splitlog/hdfs%3A%2F%2F192.168.47.205%3A9000%2Fhbase%2F.logs%2Flinux-114.site%2C60020%2C1331720381665-splitting%2Flinux-114.site%252C60020%252C1331720381665.1331752316170
hbase-root-master-HOST-192-168-47-204.log.2012-03-14(89456,99):2012-03-14 19:34:32,497 DEBUG org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback: deleted /hbase/splitlog/hdfs%3A%2F%2F192.168.47.205%3A9000%2Fhbase%2F.logs%2Flinux-114.site%2C60020%2C1331720381665-splitting%2Flinux-114.site%252C60020%252C1331720381665.1331752316170
{noformat}
[jira] [Commented] (HBASE-5606) SplitLogManger async delete node hangs log splitting when ZK connection is lost
[ https://issues.apache.org/jira/browse/HBASE-5606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238794#comment-13238794 ]

Prakash Khemani commented on HBASE-5606:

@Jimmy This is similar to HBASE-5081 w.r.t. what goes wrong - a pending delete creates havoc on the next create. But it is different from HBASE-5081 because the pending delete is created at a different point in the code - in the timeoutMonitor, and not when the task actually fails ...
[jira] [Commented] (HBASE-5606) SplitLogManger async delete node hangs log splitting when ZK connection is lost
[ https://issues.apache.org/jira/browse/HBASE-5606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13234592#comment-13234592 ]

Prakash Khemani commented on HBASE-5606:

@Chinna It is the TimeoutMonitor that causes so many deletes to be queued. The fix will be the following: in TimeoutMonitor, do not call getDataSetWatch() if the task has already failed, and ignore the call to getDataSetWatch() if there is already a pending getDataSetWatch() against the task. Thanks for finding this issue.
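A minimal sketch of that fix under assumed names (Task, chore(), the pending flag); the real TimeoutMonitor differs:

{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical TimeoutMonitor guard: never probe the znode of an
// already-failed task, and never stack a second getDataSetWatch() on top of
// a pending one.
public class TimeoutMonitorGuard {
  enum Status { IN_PROGRESS, SUCCESS, FAILURE }

  static final class Task {
    volatile Status status = Status.IN_PROGRESS;
    final AtomicBoolean getDataPending = new AtomicBoolean(false);
  }

  private final Map<String, Task> tasks = new ConcurrentHashMap<>();

  void chore() {
    for (Map.Entry<String, Task> e : tasks.entrySet()) {
      Task task = e.getValue();
      if (task.status == Status.FAILURE) {
        continue; // its znode is about to be deleted; probing it only queues churn
      }
      if (task.getDataPending.compareAndSet(false, true)) {
        getDataSetWatch(e.getKey()); // the callback must reset getDataPending
      }
    }
  }

  private void getDataSetWatch(String path) {
    // async zk.getData(...) elided; its callback resets getDataPending to false
  }
}
{code}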
[jira] [Commented] (HBASE-5606) SplitLogManger async delete node hangs log splitting when ZK connection is lost
[ https://issues.apache.org/jira/browse/HBASE-5606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13235297#comment-13235297 ]

Prakash Khemani commented on HBASE-5606:

The getDataSetWatch() call in the timeout monitor is only done to check whether the znode still exists. If there is a failure in getting to the znode, then we should ignore that failure. How about implementing the following in the timeout monitor: call getDataSetWatch() only if the task has not already failed (this is just an optimization and can be done without any locking), and for this particular getDataSetWatch() call, store an IGNORE-ZK-ERROR flag in the ZK async context. If a ZK error happens, silently do nothing.
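The IGNORE-ZK-ERROR idea can ride on ZooKeeper's async-callback context object; a sketch, with the flag value and callback body as assumptions:

{code:java}
import org.apache.zookeeper.AsyncCallback;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

// Hypothetical sketch: the timeout monitor's existence probe passes an
// "ignore errors" flag through the ZK async context, so the shared callback
// can swallow failures for this call only.
public class IgnorableGetData {
  private final ZooKeeper zk;

  IgnorableGetData(ZooKeeper zk) {
    this.zk = zk;
  }

  void probeFromTimeoutMonitor(String path) {
    // ctx = Boolean.TRUE marks this getData as an ignorable existence check.
    zk.getData(path, true, new GetDataAsyncCallback(), Boolean.TRUE);
  }

  static class GetDataAsyncCallback implements AsyncCallback.DataCallback {
    @Override
    public void processResult(int rc, String path, Object ctx, byte[] data, Stat stat) {
      if (rc != KeeperException.Code.OK.intValue()) {
        if (Boolean.TRUE.equals(ctx)) {
          return; // IGNORE-ZK-ERROR: the probe failed, silently do nothing
        }
        // normal error handling (resubmit / mark failure) elided
        return;
      }
      // normal data handling (heartbeat bookkeeping, re-arm watch) elided
    }
  }
}
{code}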
[jira] [Commented] (HBASE-5528) Retry splitting log if failed in the process of ServerShutdownHandler, and abort master when retries exhausted
[ https://issues.apache.org/jira/browse/HBASE-5528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13223475#comment-13223475 ]

Prakash Khemani commented on HBASE-5528:

I think the log-splitting retry logic is already there in ServerShutdownHandler ... in ServerShutdownHandler.process(), the handler is requeued in case of error:

{code}
try {
  if (this.shouldSplitHlog) {
    LOG.info("Splitting logs for " + serverName);
    this.services.getMasterFileSystem().splitLog(serverName);
  } else {
    LOG.info("Skipping log splitting for " + serverName);
  }
} catch (IOException ioe) {
  this.services.getExecutorService().submit(this);
  this.deadServers.add(serverName);
  throw new IOException("failed log splitting for " + serverName + ", will retry", ioe);
}
{code}

Retry splitting log if failed in the process of ServerShutdownHandler, and abort master when retries exhausted
Key: HBASE-5528
URL: https://issues.apache.org/jira/browse/HBASE-5528
Project: HBase
Issue Type: Bug
Reporter: chunhui shen
Assignee: chunhui shen
Attachments: hbase-5528.patch, hbase-5528v2.patch

We will retry splitting the log if it fails in splitLogAfterStartup() when the master starts. However, there is no retry for failed log splitting in the process of ServerShutdownHandler. Also, if we finally fail to split the log, we should abort the master even if the filesystem is OK, to prevent data loss.
[jira] [Commented] (HBASE-5270) Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler
[ https://issues.apache.org/jira/browse/HBASE-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13219413#comment-13219413 ]

Prakash Khemani commented on HBASE-5270:

@Stack If we presume that the list of servers that joinCluster() received contains the server hosting .META., then the next step that you outlined in your scenario cannot be allowed. If we are splitting logs for .META. then we have determined that the meta server was not running, and therefore it cannot be taking edits. The problem you are outlining is probably still there, but the scenario has to be refined.

Anyway, my point was: at startup the master should determine once which servers are up and which are not. This should include whether ROOT and META are assigned or not. It should then initialize everything based on that knowledge, which must not change during initialization. Anything that changes during initialization should be taken care of by the normal server-shutdown handlers. But I have to admit, I don't understand the assignment complexities very well ... I will read up some more.

Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler
Key: HBASE-5270
URL: https://issues.apache.org/jira/browse/HBASE-5270
Project: HBase
Issue Type: Sub-task
Components: master
Reporter: Zhihong Yu
Assignee: chunhui shen
Fix For: 0.92.1, 0.94.0
Attachments: 5270-90-testcase.patch, 5270-90-testcasev2.patch, 5270-90.patch, 5270-90v2.patch, 5270-90v3.patch, 5270-testcase.patch, 5270-testcasev2.patch, hbase-5270.patch, hbase-5270v2.patch, hbase-5270v4.patch, hbase-5270v5.patch, hbase-5270v6.patch, hbase-5270v7.patch, hbase-5270v8.patch, sampletest.txt

This JIRA continues the effort from HBASE-5179. Starting with Stack's comments about patches for 0.92 and TRUNK:

Reviewing 0.92v17: isDeadServerInProgress is a new public method in ServerManager but it does not seem to be used anywhere. Does isDeadRootServerInProgress need to be public? Ditto for the meta version. The method param names are not right: 'definitiveRootServer' - what is meant by definitive? Do they need this qualifier? Is there anything in place to stop us expiring a server twice if it's carrying root and meta? What is the difference between asking the assignment manager isCarryingRoot and this variable that is passed in? Should be doc'd at least. Ditto for meta.

I think I've asked for this a few times - onlineServers needs to be explained... either in javadoc or in a comment. This is the param passed into joinCluster. How does it arise? I think I know but am unsure. God love the poor noob that comes awandering this code trying to make sense of it all. It looks like we get the list by trawling ZK for regionserver znodes that have not checked in. Don't we do this operation earlier in master setup? Are we doing it again here?

Though distributed split log is configured, we will do single-process splitting in the master under some conditions with this patch. It's not explained in the code why we would do this. Why do we think master log splitting 'high priority' when it could very well be slower? Should we only go this route if distributed splitting is not going on? Do we know if concurrent distributed log splitting and master splitting works? Why would we have dead servers in progress here in master startup? Because a servershutdownhandler fired?

This patch is different from the patch for 0.90. It should go into trunk first with tests, then 0.92. Should it be in this issue? This issue is really hard to follow now. Maybe this issue is for 0.90.x and a new issue for more work on this trunk patch? This patch needs to have the v18 differences applied.
[jira] [Commented] (HBASE-5270) Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler
[ https://issues.apache.org/jira/browse/HBASE-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13218407#comment-13218407 ]

Prakash Khemani commented on HBASE-5270:

Assuming that the master uses the saved region-server list in joinCluster, can you then please outline the scenario where problems can still happen? There is some handling of META and ROOT not being available in ServerShutdownHandler, and I am wondering why that is not sufficient.
[jira] [Commented] (HBASE-5270) Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler
[ https://issues.apache.org/jira/browse/HBASE-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13217968#comment-13217968 ]

Prakash Khemani commented on HBASE-5270:

(I haven't read through the comments carefully, and I am sorry for the noise if I am way off the mark.) The problem as I see it is that the master's understanding of which region servers are online changes from the time that it calls splitLogAfterStartup() to the time it calls rebuildUserRegions() in joinCluster(). I feel that it might be a lot simpler if the master saved the list of region servers that it had given to splitLogAfterStartup(), and later used the same list for rebuilding user regions. That should fix this issue, won't it?
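A sketch of the saved-list idea; the class and method names here are illustrative assumptions, not the actual HMaster code:

{code:java}
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: snapshot the set of live region servers once at
// startup, then drive both log splitting and region rebuilding from the same
// snapshot so the master's view cannot shift between the two steps.
public class StartupServerSnapshot {
  private final Set<String> onlineAtStartup;

  StartupServerSnapshot(Set<String> serversCheckedInViaZk) {
    this.onlineAtStartup = Collections.unmodifiableSet(new TreeSet<>(serversCheckedInViaZk));
  }

  void joinCluster() {
    splitLogAfterStartup(onlineAtStartup);  // split logs of servers NOT in the snapshot
    rebuildUserRegions(onlineAtStartup);    // rebuild against the very same snapshot
  }

  private void splitLogAfterStartup(Set<String> online) { /* elided */ }

  private void rebuildUserRegions(Set<String> online) { /* elided */ }
}
{code}

Servers that come or go after the snapshot is taken would be handled by the normal server-shutdown handlers, as the comment above suggests.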
[jira] [Commented] (HBASE-4932) Block cache can be mistakenly instantiated by tools
[ https://issues.apache.org/jira/browse/HBASE-4932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13215947#comment-13215947 ]

Prakash Khemani commented on HBASE-4932:

Yes ... it is a good-to-have patch. Thanks.

Block cache can be mistakenly instantiated by tools
Key: HBASE-4932
URL: https://issues.apache.org/jira/browse/HBASE-4932
Project: HBase
Issue Type: Bug
Reporter: Prakash Khemani
Assignee: Prakash Khemani
Fix For: 0.94.0
Attachments: HBASE-4932.patch

MapReduce tasks that create a writer to write HFiles inadvertently end up creating the block cache.
[jira] [Commented] (HBASE-5347) GC free memory management in Level-1 Block Cache
[ https://issues.apache.org/jira/browse/HBASE-5347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13215085#comment-13215085 ]

Prakash Khemani commented on HBASE-5347:

Lars, you are right. I have been trying to induce a full GC but without any success. (I can induce a full GC if I artificially hold some key-values in a queue and force them to be tenured.)

On 89-fb, my test case is doing random increments over a space of slightly more than 40GB worth of key-value data. The heap is set to 36GB. The LRU cache has high and low watermarks of 0.98 and 0.85. The region server spawns 1000 threads that continuously do the increments. The eviction thread manages to keep the block cache at about 85% always. Cache-on-write is turned on to induce more cache churn. All 12 disks are pegged at close to 100% reads. GC takes 60% of the CPU (sum of user times in 1000 lines of GC log / (elapsed time * #cpus)). Compactions that get started never complete while the load is on.

I guess I have to change the dynamics of the test case to induce GC pauses.

GC free memory management in Level-1 Block Cache
Key: HBASE-5347
URL: https://issues.apache.org/jira/browse/HBASE-5347
Project: HBase
Issue Type: Improvement
Reporter: Prakash Khemani
Assignee: Prakash Khemani
Attachments: D1635.5.patch

On eviction of a block from the block cache, instead of waiting for the garbage collector to reuse its memory, reuse the block right away. This will require us to keep reference counts on the HFile blocks. Once we have the reference counts in place we can do our own simple blocks-out-of-slab allocation for the block cache. This will help us with:
* reducing GC pressure, especially in the old generation
* making it possible to have non-Java-heap memory backing the HFile blocks
[jira] [Commented] (HBASE-5332) Deterministic Compaction Jitter
[ https://issues.apache.org/jira/browse/HBASE-5332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13212858#comment-13212858 ]

Prakash Khemani commented on HBASE-5332:

The major compactions are jittered so that too many of them don't happen at the same time. Rather than relying on random jitter, why can't the compaction thread simply ensure that it doesn't schedule too many compactions at the same time?

Deterministic Compaction Jitter
Key: HBASE-5332
URL: https://issues.apache.org/jira/browse/HBASE-5332
Project: HBase
Issue Type: Improvement
Reporter: Nicolas Spiegelberg
Assignee: Nicolas Spiegelberg
Priority: Minor
Attachments: D1785.1.patch, D1785.2.patch, D1785.3.patch

Currently, we add jitter to a compaction using delay + jitter*(1 - 2*Math.random()). Since this is non-deterministic, we can get major compaction storms on server restart, as half the Stores that were set to delay + jitter will now be set to delay - jitter. We need a more deterministic way to jitter major compactions so this information can persist across server restarts.
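One deterministic alternative, sketched below: seed the PRNG from a stable property of the store so the computed jitter survives restarts. The seed choice (a hash of a store key) is an assumption for illustration, not the committed fix:

{code:java}
import java.util.Random;

// Hypothetical deterministic compaction jitter: the same store always computes
// the same offset, so a restart cannot flip "delay + jitter" into
// "delay - jitter" and trigger a compaction storm.
public final class CompactionJitter {
  private CompactionJitter() {
  }

  /**
   * @param delay    base major-compaction period, ms
   * @param jitter   maximum absolute jitter, ms
   * @param storeKey stable identifier, e.g. "table,region,family" (assumed)
   */
  public static long jitteredDelay(long delay, long jitter, String storeKey) {
    Random rnd = new Random(storeKey.hashCode()); // deterministic per store
    double u = 1.0 - 2.0 * rnd.nextDouble();      // uniform in (-1, 1]
    return delay + (long) (jitter * u);
  }

  public static void main(String[] args) {
    // The same store key yields the same schedule across restarts.
    System.out.println(jitteredDelay(86_400_000L, 3_600_000L, "t1,abc,cf"));
    System.out.println(jitteredDelay(86_400_000L, 3_600_000L, "t1,abc,cf"));
  }
}
{code}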
[jira] [Commented] (HBASE-5347) GC free memory management in Level-1 Block Cache
[ https://issues.apache.org/jira/browse/HBASE-5347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13205647#comment-13205647 ]

Prakash Khemani commented on HBASE-5347:

Another advantage of this approach will be that we will be able to get rid of the low/high watermarks in LRUBlockCache and make block eviction synchronous with demand. The default values of the watermarks are 75% and 85% (in 89). That means we waste somewhere around 20% of the block cache today because of asynchronous garbage collection.
[jira] [Commented] (HBASE-5347) GC free memory management in Level-1 Block Cache
[ https://issues.apache.org/jira/browse/HBASE-5347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13203871#comment-13203871 ]

Prakash Khemani commented on HBASE-5347:

Initial diff for feedback: https://reviews.facebook.net/D1635
[jira] [Commented] (HBASE-5313) Restructure hfiles layout for better compression
[ https://issues.apache.org/jira/browse/HBASE-5313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13203918#comment-13203918 ]

Prakash Khemani commented on HBASE-5313:

The values can be kept compressed in memory. We can uncompress them on demand when writing out the key-values during RPC or compactions.

The key has to have a pointer to the values. The pointer can be implicit - derived from the value lengths - if all the values are stored in the same order as the keys. The value pointer has to be explicit if the values are stored in a different order than the keys. We might want to write out the values in a different order if we want to do per-column compression.

While writing out the HFileBlock the following can be done: group all the values by their column identifier, independently compress and write out each group of values, then go back to the keys and update the value pointers.

Restructure hfiles layout for better compression
Key: HBASE-5313
URL: https://issues.apache.org/jira/browse/HBASE-5313
Project: HBase
Issue Type: Improvement
Components: io
Reporter: dhruba borthakur
Assignee: dhruba borthakur

An HFile block contains a stream of key-values. Can we organize these KVs on disk in a better way so that we get much greater compression ratios? One option (thanks Prakash) is to store all the keys in the beginning of the block (let's call this the key-section) and then store all their corresponding values towards the end of the block. This will allow us to not even decompress the values when we are scanning and skipping over rows in the block. Any other ideas?
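A rough sketch of the write path described in that comment; all names (KV, writeBlock, the per-group deflate streams) are illustrative assumptions, not HFile internals:

{code:java}
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.zip.DeflaterOutputStream;

// Hypothetical block layout: [key section with explicit value pointers]
// [per-column, independently compressed value groups].
public class KeySectionBlockWriter {

  static final class KV {
    final byte[] key; final String column; final byte[] value;
    KV(byte[] k, String c, byte[] v) { key = k; column = c; value = v; }
  }

  static byte[] writeBlock(List<KV> kvs) throws IOException {
    // 1. Group values by column identifier so each group compresses well.
    Map<String, List<byte[]>> groups = new LinkedHashMap<>();
    for (KV kv : kvs) {
      groups.computeIfAbsent(kv.column, c -> new ArrayList<>()).add(kv.value);
    }

    // 2. Compress each group independently; record where each group starts.
    ByteArrayOutputStream valueSection = new ByteArrayOutputStream();
    Map<String, Integer> groupOffset = new LinkedHashMap<>();
    for (Map.Entry<String, List<byte[]>> e : groups.entrySet()) {
      groupOffset.put(e.getKey(), valueSection.size());
      DeflaterOutputStream dos = new DeflaterOutputStream(valueSection);
      for (byte[] v : e.getValue()) dos.write(v);
      dos.finish(); // terminate this group's compressed stream
    }

    // 3. Write keys with explicit value pointers (group offset + slot within
    //    the group), then append the compressed value section.
    ByteArrayOutputStream block = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(block);
    Map<String, Integer> nextIndex = new LinkedHashMap<>();
    out.writeInt(kvs.size());
    for (KV kv : kvs) {
      int idx = nextIndex.merge(kv.column, 1, Integer::sum) - 1;
      out.writeInt(kv.key.length);
      out.write(kv.key);
      out.writeInt(groupOffset.get(kv.column)); // value pointer: group start
      out.writeInt(idx);                        // value pointer: slot in group
    }
    byte[] values = valueSection.toByteArray();
    out.writeInt(values.length);
    out.write(values);
    return block.toByteArray();
  }
}
{code}

Here the group offsets are computed before the key section is written, which sidesteps the back-patching step the comment mentions; a streaming writer would instead patch the pointers after the value groups are laid out.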
[jira] [Commented] (HBASE-5010) Filter HFiles based on TTL
[ https://issues.apache.org/jira/browse/HBASE-5010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194454#comment-13194454 ]

Prakash Khemani commented on HBASE-5010:

This change doesn't break HBASE-4721. HBASE-4721 introduced another parameter, hbase.hstore.time.to.purge.deletes, to keep deletes even after major compactions. But hbase.hstore.time.to.purge.deletes doesn't override the TTL of the store. Pasting the comment from the code, which hopefully makes it clear that this diff works with HBASE-4721:

{code}
// By default, when hbase.hstore.time.to.purge.deletes is 0ms, a delete
// marker is always removed during a major compaction. If set to a non-zero
// value then major compaction will try to keep a delete marker around for
// the given number of milliseconds. We want to keep the delete markers
// around a bit longer because old puts might appear out-of-order. For
// example, during log replication between two clusters.
//
// If the delete marker has lived longer than its column-family's TTL then
// the delete marker will be removed even if time.to.purge.deletes has not
// passed. This is because all the Puts that this delete marker can influence
// would have also expired. (Removing of delete markers on col family TTL will
// not happen if min-versions is set to non-zero)
{code}

Filter HFiles based on TTL
Key: HBASE-5010
URL: https://issues.apache.org/jira/browse/HBASE-5010
Project: HBase
Issue Type: Bug
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
Fix For: 0.94.0
Attachments: 5010.patch, D1017.1.patch, D1017.2.patch, D909.1.patch, D909.2.patch, D909.3.patch, D909.4.patch, D909.5.patch, D909.6.patch

In ScanWildcardColumnTracker we have:

{code:java}
this.oldestStamp = EnvironmentEdgeManager.currentTimeMillis() - ttl;
...
private boolean isExpired(long timestamp) {
  return timestamp < oldestStamp;
}
{code}

but this time-range filtering does not participate in HFile selection. In one real case this caused next() calls to time out because all KVs in a table had expired, but next() had to iterate over the whole table to find that out. We should be able to filter out those HFiles right away. I think a reasonable approach is to add a default timerange filter to every scan for a CF with a finite TTL and utilize the existing filtering in StoreFile.Reader.passesTimerangeFilter.
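The rule in that pasted comment, restated as a small predicate; the method name and the Long.MAX_VALUE sentinel for "no TTL" are assumptions for illustration:

{code:java}
// Hypothetical restatement of the delete-marker purge rule described above.
public final class DeleteMarkerRule {
  private DeleteMarkerRule() {
  }

  static boolean canPurgeDeleteMarker(long markerTs, long now, long familyTtlMs,
      long timeToPurgeDeletesMs, int minVersions) {
    // The marker outlived its column family's TTL: every Put it could mask
    // has expired too, so it can go -- unless min-versions forces retention.
    if (minVersions == 0 && familyTtlMs != Long.MAX_VALUE
        && markerTs < now - familyTtlMs) {
      return true;
    }
    // Otherwise honor hbase.hstore.time.to.purge.deletes (0 means the marker
    // is purged by the next major compaction right away).
    return markerTs < now - timeToPurgeDeletesMs;
  }
}
{code}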
[jira] [Commented] (HBASE-5136) Redundant MonitoredTask instances in case of distributed log splitting retry
[ https://issues.apache.org/jira/browse/HBASE-5136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13181777#comment-13181777 ]

Prakash Khemani commented on HBASE-5136:

It will be a lot simpler to do status.cleanup() in the finally block in splitLogDistributed().

Redundant MonitoredTask instances in case of distributed log splitting retry
Key: HBASE-5136
URL: https://issues.apache.org/jira/browse/HBASE-5136
Project: HBase
Issue Type: Task
Reporter: Zhihong Yu
Assignee: Zhihong Yu
Attachments: 5136.txt

In case of log splitting retry, the following code would be executed multiple times:

{code}
public long splitLogDistributed(final List<Path> logDirs) throws IOException {
  MonitoredTask status = TaskMonitor.get().createStatus(
      "Doing distributed log split in " + logDirs);
{code}

leading to multiple MonitoredTask instances. Users may get confused by multiple distributed log splitting entries for the same region server on the master UI.
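A sketch of that suggestion - the method shape follows the snippet above, with the body elided and status.cleanup() taken from the comment (assumed to idempotently retire the MonitoredTask):

{code:java}
public long splitLogDistributed(final List<Path> logDirs) throws IOException {
  MonitoredTask status = TaskMonitor.get().createStatus(
      "Doing distributed log split in " + logDirs);
  try {
    // install tasks, wait for the workers, tally errors ... (elided)
    return totalSplitSize; // assumed result variable
  } finally {
    // Runs on success, failure, and every retry alike, so a retried split
    // never leaves a stale MonitoredTask entry behind on the master UI.
    status.cleanup();
  }
}
{code}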
[jira] [Commented] (HBASE-5081) Distributed log splitting deleteNode races against splitLog retry
[ https://issues.apache.org/jira/browse/HBASE-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13179891#comment-13179891 ]

Prakash Khemani commented on HBASE-5081:

The retry logic is in HMaster.splitLogAfterStartup(). I will remove the OrphanLogException handling from MasterFileSystem.

Distributed log splitting deleteNode races against splitLog retry
Key: HBASE-5081
URL: https://issues.apache.org/jira/browse/HBASE-5081
Project: HBase
Issue Type: Bug
Components: wal
Affects Versions: 0.92.0, 0.94.0
Reporter: Jimmy Xiang
Assignee: Prakash Khemani
Fix For: 0.92.0
Attachments: 0001-HBASE-5081-jira-Distributed-log-splitting-deleteNode.patch, 0001-HBASE-5081-jira-Distributed-log-splitting-deleteNode.patch, 0001-HBASE-5081-jira-Distributed-log-splitting-deleteNode.patch, 0001-HBASE-5081-jira-Distributed-log-splitting-deleteNode.patch, 0001-HBASE-5081-jira-Distributed-log-splitting-deleteNode.patch, distributed-log-splitting-screenshot.png, hbase-5081-patch-v6.txt, hbase-5081-patch-v7.txt, hbase-5081_patch_for_92_v4.txt, hbase-5081_patch_v5.txt, patch_for_92.txt, patch_for_92_v2.txt, patch_for_92_v3.txt

Recently, during 0.92 RC testing, we found that distributed log splitting hangs there forever. Please see the attached screenshot. I looked into it and here is what I think happened:

1. One RS died; the ServerShutdownHandler found it out and started the distributed log splitting.
2. All three tasks failed, so the three tasks were deleted, asynchronously.
3. ServerShutdownHandler retried the log splitting.
4. During the retrial, it created these three tasks again and put them in a hashmap (tasks).
5. The asynchronous deletion from step 2 finally happened for one task; in the callback, it removed one task from the hashmap.
6. One of the newly submitted tasks' zookeeper watchers found that the task is unassigned and not in the hashmap, so it created a new orphan task.
7. All three tasks failed, but the task created in step 6 is an orphan, so the batch.err counter was one short; the log splitting hangs there and keeps waiting for the last task to finish, which is never going to happen.

So I think the problem is step 2. The fix is to make the deletion sync instead of async, so that the retry will have a clean start. An async deleteNode will mess up the split-log retrial. In an extreme situation, if the async deleteNode doesn't happen soon enough, some node created during the retrial could be deleted. deleteNode should be sync.
[jira] [Commented] (HBASE-5081) Distributed log splitting deleteNode races against splitLog retry
[ https://issues.apache.org/jira/browse/HBASE-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13179894#comment-13179894 ]

Prakash Khemani commented on HBASE-5081:

Will look into the test failure. I am not sure I know where to find the test run's output logs.
[jira] [Commented] (HBASE-5081) Distributed log splitting deleteNode races against splitLog retry
[ https://issues.apache.org/jira/browse/HBASE-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13179926#comment-13179926 ]

Prakash Khemani commented on HBASE-5081:

If there is a spurious wakeup before the status has changed to DELETED, then the code will return an error (oldtask) to the caller.

Regarding the hung TestSplitLogManager test in https://builds.apache.org/job/PreCommit-HBASE-Build/665/console - I couldn't find what failed or what hung. https://builds.apache.org/job/PreCommit-HBASE-Build/665//testReport/org.apache.hadoop.hbase.master/TestSplitLogManager/ shows that everything passed.
[jira] [Commented] (HBASE-5081) Distributed log splitting deleteNode races against splitLog retry
[ https://issues.apache.org/jira/browse/HBASE-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13180167#comment-13180167 ]

Prakash Khemani commented on HBASE-5081:

{code}
+        while (oldtask.status == FAILURE) {
+          // wait for status to change to DELETED
+          try {
+            oldtask.wait();
+          } catch (InterruptedException e) {
+            Thread.currentThread().interrupt();
+            LOG.warn("Interrupted when waiting for znode delete callback");
+            // fall through to return failure
+          }
         }
-        oldtask.setBatch(batch);
       }
{code}

Changing the 'if' to 'while' is OK. But in case of InterruptedException you should exit the while loop, fall through, and return. If you don't return on interrupt then there is a good possibility of deadlock when the process is trying to exit. Also, there is no point calling oldtask.wait() with the thread's interrupt flag set - it will immediately throw InterruptedException again.
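The loop with the suggested fix applied - break out on interrupt instead of re-entering wait() with the interrupt flag set (oldtask, FAILURE, and LOG are assumed from the patch above):

{code:java}
synchronized (oldtask) {
  while (oldtask.status == FAILURE) {
    // Wait for the async delete callback to flip the status to DELETED.
    try {
      oldtask.wait();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      LOG.warn("Interrupted when waiting for znode delete callback");
      // Exit the loop: with the interrupt flag set, wait() would throw again
      // immediately, spinning here and blocking process shutdown.
      break;
    }
  }
}
// Fall through: the caller sees the old (failed) task and returns an error.
return oldtask;
{code}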
[jira] [Commented] (HBASE-5081) Distributed log splitting deleteNode races against splitLog retry
[ https://issues.apache.org/jira/browse/HBASE-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13180206#comment-13180206 ]

Prakash Khemani commented on HBASE-5081:

The latest patch uploaded by Ted looks good. I will try to develop a test case for the delayed-delete handling.
[jira] [Commented] (HBASE-5081) Distributed log splitting deleteNode races against splitLog retry
[ https://issues.apache.org/jira/browse/HBASE-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13179320#comment-13179320 ] Prakash Khemani commented on HBASE-5081: Assuming splitlog failed, the delete of the zk-task-node is queued up, splitlog is retried, and createTaskIfAbsent() is called. The following piece of code in createTaskIfAbsent() will be hit (because the oldtask status is neither IN_PROGRESS nor SUCCESS; the oldtask status is FAILED):
{code}
LOG.warn("Transient problem. Failure because previously failed task"
    + " state still present. Waiting for znode delete callback"
    + " path=" + path);
return oldtask;
{code}
The splitlog retry will fail immediately with an IOException ("duplicate log split scheduled for ..."). The caller (the master) will wait and retry again. Distributed log splitting deleteNode races against splitLog retry -- Key: HBASE-5081 URL: https://issues.apache.org/jira/browse/HBASE-5081 Project: HBase Issue Type: Bug Components: wal Affects Versions: 0.92.0, 0.94.0 Reporter: Jimmy Xiang Assignee: Prakash Khemani Fix For: 0.92.0 Attachments: 0001-HBASE-5081-jira-Distributed-log-splitting-deleteNode.patch, 0001-HBASE-5081-jira-Distributed-log-splitting-deleteNode.patch, 0001-HBASE-5081-jira-Distributed-log-splitting-deleteNode.patch, 0001-HBASE-5081-jira-Distributed-log-splitting-deleteNode.patch, distributed-log-splitting-screenshot.png, hbase-5081-patch-v6.txt, hbase-5081-patch-v7.txt, hbase-5081_patch_for_92_v4.txt, hbase-5081_patch_v5.txt, patch_for_92.txt, patch_for_92_v2.txt, patch_for_92_v3.txt Recently, during 0.92 RC testing, we found that distributed log splitting hangs there forever. Please see the attached screen shot. I looked into it, and here is what happened, I think: 1. One RS died; the ServerShutdownHandler found it out and started the distributed log splitting; 2. All three tasks failed, so the three tasks were deleted, asynchronously; 3. The ServerShutdownHandler retried the log splitting; 4. During the retry, it created these three tasks again and put them in a hashmap (tasks); 5. The asynchronous deletion in step 2 finally happened for one task; in the callback, it removed that task from the hashmap; 6. One of the newly submitted tasks' zookeeper watcher found out that the task is unassigned and not in the hashmap, so it created a new orphan task. 7. All three tasks failed, but the task created in step 6 is an orphan, so the batch.err counter was one short, and the log splitting hangs there, forever waiting for the last task to finish, which is never going to happen. So I think the problem is step 2. The fix is to make the deletion sync, instead of async, so that the retry will have a clean start. Async deleteNode will interfere with the split log retry. In an extreme situation, if the async deleteNode doesn't happen soon enough, some node created during the retry could be deleted. deleteNode should be sync. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
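For readers without the source at hand, here is a hedged, self-contained model of the control flow being described; the names mirror SplitLogManager.createTaskIfAbsent(), but the bodies are simplified illustrations, not the real code:
{code}
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical, simplified model of the race described above: a retried
// splitLog() finds the old FAILED task still in the map because its async
// znode delete has not completed yet, so the retry fails fast.
class SplitTaskBookkeeping {
  enum Status { IN_PROGRESS, SUCCESS, FAILED }

  static class Task {
    volatile Status status = Status.FAILED;
  }

  private final ConcurrentMap<String, Task> tasks = new ConcurrentHashMap<>();

  /** Returns null if a fresh task was installed, or the stale old task. */
  Task createTaskIfAbsent(String path) {
    Task oldtask = tasks.putIfAbsent(path, new Task());
    if (oldtask == null) {
      return null; // clean start: caller proceeds with the new task
    }
    if (oldtask.status != Status.IN_PROGRESS && oldtask.status != Status.SUCCESS) {
      // FAILED task still present: the znode delete callback hasn't fired yet.
      // The caller maps the non-null return to an IOException ("duplicate log
      // split scheduled ...") and the master waits and retries.
      System.err.println("Transient problem: previously failed task still"
          + " present, waiting for znode delete callback, path=" + path);
    }
    return oldtask;
  }
}
{code}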
[jira] [Commented] (HBASE-5029) TestDistributedLogSplitting fails on occasion
[ https://issues.apache.org/jira/browse/HBASE-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13172613#comment-13172613 ] Prakash Khemani commented on HBASE-5029: I have been running this test in a loop on my laptop for a while, but I haven't been able to reproduce it. I looked at the code and could not figure out why the test would fail. The test can fail if, for some reason, the task that has been put up in zookeeper doesn't get acquired for 30 seconds. It would be easier to fix this if I had all the logs. We have been running distributed log splitting in production for quite some time. And yes, I have tested a region server aborting while it is executing a task a number of times. TestDistributedLogSplitting fails on occasion - Key: HBASE-5029 URL: https://issues.apache.org/jira/browse/HBASE-5029 Project: HBase Issue Type: Bug Reporter: stack Assignee: Prakash Khemani Priority: Critical Attachments: 0001-HBASE-5029-jira-TestDistributedLogSplitting-fails-on.patch, 5029-addingignore.txt, HBASE-5029.D891.1.patch, HBASE-5029.D891.2.patch This is how it usually fails: https://builds.apache.org/view/G-L/view/HBase/job/HBase-0.92/lastCompletedBuild/testReport/org.apache.hadoop.hbase.master/TestDistributedLogSplitting/testWorkerAbort/ Assigning mighty Prakash since he offered to take a looksee. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5029) TestDistributedLogSplitting fails on occasion
[ https://issues.apache.org/jira/browse/HBASE-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13172677#comment-13172677 ] Prakash Khemani commented on HBASE-5029: The cause of this error appears to be the following DFSClient exception:
{code}
2011-12-17 01:14:48,369 ERROR [SplitLogWorker-janus.apache.org,53708,1324084461889] regionserver.SplitLogWorker(169): unexpected error
java.lang.NullPointerException
	at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.closeThreads(DFSClient.java:3831)
	at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.closeInternal(DFSClient.java:3874)
	at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.close(DFSClient.java:3809)
	at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:61)
	at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:86)
	at org.apache.hadoop.io.SequenceFile$Writer.close(SequenceFile.java:1017)
	at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.close(SequenceFileLogWriter.java:214)
	at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.splitLogFileToTemp(HLogSplitter.java:458)
	at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.splitLogFileToTemp(HLogSplitter.java:351)
	at org.apache.hadoop.hbase.regionserver.SplitLogWorker$1.exec(SplitLogWorker.java:113)
	at org.apache.hadoop.hbase.regionserver.SplitLogWorker.grabTask(SplitLogWorker.java:266)
	at org.apache.hadoop.hbase.regionserver.SplitLogWorker.taskLoop(SplitLogWorker.java:197)
	at org.apache.hadoop.hbase.regionserver.SplitLogWorker.run(SplitLogWorker.java:165)
	at java.lang.Thread.run(Thread.java:662)
{code}
(Much earlier in the logs, while the SplitLogWorker was trying to recover the lease on the log file, it had received a Thread.interrupt() because the region server was exiting. That Thread.interrupt() was unsuccessful in interrupting the recoverLease() call. It is also possible that the interrupt was eaten up during the recoverLease() call.) The split-log-worker thread continued to split the log file. It successfully split the file, but in the end it hit this exception. It is possible that the file system was already closed by the time the above exception happened. The fix probably requires some more checking in DFSClient$DFSOutputStream.closeInternal() for a closed file system. The more difficult task is to make sure that recoverLease() handles interrupts correctly. TestDistributedLogSplitting fails on occasion - Key: HBASE-5029 URL: https://issues.apache.org/jira/browse/HBASE-5029 Project: HBase Issue Type: Bug Reporter: stack Assignee: Prakash Khemani Priority: Critical Attachments: 0001-HBASE-5029-jira-TestDistributedLogSplitting-fails-on.patch, 5029-addingignore.txt, HBASE-5029.D891.1.patch, HBASE-5029.D891.2.patch This is how it usually fails: https://builds.apache.org/view/G-L/view/HBase/job/HBase-0.92/lastCompletedBuild/testReport/org.apache.hadoop.hbase.master/TestDistributedLogSplitting/testWorkerAbort/ Assigning mighty Prakash since he offered to take a looksee. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
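The second half of that fix, making lease recovery honor interrupts, might look roughly like the following. This is a hypothetical sketch (the Attempt interface and recoverLease() wrapper are made-up stand-ins, not the actual HDFS or HBase change); the point is to surface the interrupt instead of silently eating it:
{code}
import java.io.IOException;
import java.io.InterruptedIOException;

// Hypothetical retry wrapper showing the desired interrupt behavior:
// stop retrying and report the interrupt rather than splitting on.
class LeaseRecovery {
  /** Stand-in for one call to the real recoverLease(); true = recovered. */
  interface Attempt { boolean run() throws IOException; }

  void recoverLease(Attempt attempt, long retrySleepMs) throws IOException {
    while (true) {
      if (Thread.currentThread().isInterrupted()) {
        // The region server is exiting; don't keep working on this log.
        throw new InterruptedIOException("interrupted during lease recovery");
      }
      if (attempt.run()) {
        return; // lease recovered; safe to proceed with the split
      }
      try {
        Thread.sleep(retrySleepMs);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserved for the check above
      }
    }
  }
}
{code}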
[jira] [Commented] (HBASE-5013) NPE in HBaseClient$Connection.receiveResponse
[ https://issues.apache.org/jira/browse/HBASE-5013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13168168#comment-13168168 ] Prakash Khemani commented on HBASE-5013: It could be HBASE-4980. HBASE-4980 is not synced into the internal fb branch. My analysis could be wrong because I might be trying to match the stack trace against the wrong build. I will cross-check. NPE in HBaseClient$Connection.receiveResponse - Key: HBASE-5013 URL: https://issues.apache.org/jira/browse/HBASE-5013 Project: HBase Issue Type: Bug Reporter: Prakash Khemani Assignee: Prakash Khemani We have the following NPE:
{code}
java.io.IOException: Call to hbasedev003.snc3.facebook.com/10.26.1.198:60020 failed on local exception: java.io.IOException: Unexpected exception receiving call responses
	at org.apache.hadoop.hbase.ipc.HBaseClient.wrapException(HBaseClient.java:916)
	at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:885)
	at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:149)
	at $Proxy6.getProtocolVersion(Unknown Source)
	at org.apache.hadoop.hbase.ipc.WritableRpcEngine.getProxy(WritableRpcEngine.java:182)
	at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:295)
	at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:272)
	at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:324)
	at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:228)
	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1197)
	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1154)
	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1141)
	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:872)
	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:768)
	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.relocateRegion(HConnectionManager.java:742)
	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:978)
	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:772)
	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:736)
	at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:207)
	at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:177)
	at com.facebook.BulkImporter.VerifyAssocs.<init>(VerifyAssocs.java:248)
	at com.facebook.BulkImporter.VerifyAssocs$AssocVerifierMapper.setup(VerifyAssocs.java:138)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:624)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
	at org.apache.hadoop.mapred.Child.main(Child.java:159)
Caused by: java.io.IOException: Unexpected exception receiving call responses
	at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.run(HBaseClient.java:494)
Caused by: java.lang.NullPointerException
	at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.receiveResponse(HBaseClient.java:571)
	at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.run(HBaseClient.java:490)
{code}
=== Just by looking at the code, the NPE shouldn't have happened:
HBaseClient$Connection.setUpIOstreams() sets up in and out, and then starts the Connection thread. The Connection, in its run() method, calls receiveResponse(). In receiveResponse(), the NPE happens at int id = in.readInt(). As per the java.util.concurrent docs, the initialization of in should have been visible in the Connection thread's run() method, so I don't know how in ended up being null. === While looking into this issue I noticed a small problem in the closeConnection() method. I will soon upload a diff. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
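The visibility guarantee being appealed to is the happens-before edge created by Thread.start(): under the Java Memory Model, a field assigned before start() is guaranteed visible inside run(), even without volatile or synchronization. A minimal self-contained illustration (not HBaseClient code; the 4-byte payload simulating a call id is made up):
{code}
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;

// Minimal illustration of the happens-before argument in the comment above.
class Connection extends Thread {
  private DataInputStream in; // deliberately non-volatile, like HBaseClient's

  void setupIOstreams() {
    // Write to 'in' before start(): visible in run() per the Java Memory Model.
    in = new DataInputStream(new ByteArrayInputStream(new byte[] {0, 0, 0, 42}));
    start();
  }

  @Override
  public void run() {
    try {
      int id = in.readInt(); // must not NPE: 'in' was set before start()
      System.out.println("call id = " + id);
    } catch (Exception e) {
      e.printStackTrace();
    }
  }

  public static void main(String[] args) {
    new Connection().setupIOstreams();
  }
}
{code}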
[jira] [Commented] (HBASE-4987) wrong use of incarnation var in SplitLogManager
[ https://issues.apache.org/jira/browse/HBASE-4987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13165766#comment-13165766 ] Prakash Khemani commented on HBASE-4987: Old issue HBASE-4855 wrong use of incarnation var in SplitLogManager --- Key: HBASE-4987 URL: https://issues.apache.org/jira/browse/HBASE-4987 Project: HBase Issue Type: Bug Reporter: Prakash Khemani Assignee: Prakash Khemani @Ramakrishna found and analyzed an issue in SplitLogManager. But I don't think that the fix is correct. Will upload a patch shortly. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4823) long running scans lose benefit of bloomfilters and timerange hints
[ https://issues.apache.org/jira/browse/HBASE-4823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13153655#comment-13153655 ] Prakash Khemani commented on HBASE-4823: https://issues.apache.org/jira/browse/HBASE-3415 is also related. long running scans lose benefit of bloomfilters and timerange hints --- Key: HBASE-4823 URL: https://issues.apache.org/jira/browse/HBASE-4823 Project: HBase Issue Type: Bug Reporter: Kannan Muthukkaruppan Assignee: Kannan Muthukkaruppan When you have a long-running scan, due to say an MR job, you can lose the benefit of timerange hints and bloom filters midway if your scanner gets reset. [Note: the scanners can get reset, say, due to a flush or compaction.] In one of our workloads, we periodically want to do rollups on the most recent 15 minutes of data in a column family... but the timerange hint benefit is lost midway when this resetScannerStack (shown below) happens. And the end result is that we end up reading all the old HFiles rather than just the recent HFiles.
{code}
private void resetScannerStack(KeyValue lastTopKey) throws IOException {
  if (heap != null) {
    throw new RuntimeException("StoreScanner.reseek run on an existing heap!");
  }
  /* When we have the scan object, should we not pass it to getScanners()
   * to get a limited set of scanners? We did so in the constructor and we
   * could have done it now by storing the scan object from the constructor
   */
  List<KeyValueScanner> scanners = getScanners();
{code}
The comment in the code seems to be aware of this issue and even has the suggested fix! -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
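The fix the code comment hints at is to remember the scan's time range from construction and reapply it when the scanner stack is rebuilt. Here is a hedged, standalone model of that idea (the class, StoreFile fields, and selection method are illustrative stand-ins, not StoreScanner's actual API):
{code}
import java.util.ArrayList;
import java.util.List;

// Hypothetical standalone model: keep the scan's time range from the
// "constructor" and re-filter store files by it on every scanner reset,
// so old files can still be skipped after a flush/compaction reset.
class TimeRangeScanModel {
  static class StoreFile {
    final long minTs, maxTs; // timestamp range covered by the file
    StoreFile(long minTs, long maxTs) { this.minTs = minTs; this.maxTs = maxTs; }
  }

  private final long scanMinTs, scanMaxTs; // saved at construction time
  private final List<StoreFile> files;

  TimeRangeScanModel(long scanMinTs, long scanMaxTs, List<StoreFile> files) {
    this.scanMinTs = scanMinTs;
    this.scanMaxTs = scanMaxTs;
    this.files = files;
  }

  /** What resetScannerStack should do: re-filter files by the saved range. */
  List<StoreFile> getScannersForReset() {
    List<StoreFile> selected = new ArrayList<StoreFile>();
    for (StoreFile f : files) {
      if (f.maxTs >= scanMinTs && f.minTs <= scanMaxTs) {
        selected.add(f); // overlaps the scan's time range; keep it
      }
    }
    return selected; // old files outside the range are skipped again
  }
}
{code}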
[jira] [Commented] (HBASE-4721) Retain Delete Markers after Major Compaction
[ https://issues.apache.org/jira/browse/HBASE-4721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145785#comment-13145785 ] Prakash Khemani commented on HBASE-4721: I had started at a point where I thought I would independently assign TTLs to delete markers. But now I have realized that it doesn't make any sense to give a different TTL to the delete-markers. (Giving the delete-markers a smaller TTL than the puts would be incorrect; giving them a larger TTL than the puts would be pointless, because then the delete-markers would be deleting already-expired puts.) HBASE-4536 will work, but only if the keep-deleted-kvs flag is set on the column family (or is it the table?). Do you think it makes sense to make it the default behavior that, regardless of whether point-in-time queries are being supported or not, major compaction will not remove the delete-markers? A delete-marker would only be removed when it expires or when enough put versions accumulate before it. Concerns that people have raised if we stopped removing all delete markers in a major compaction: (1) Space wastage. I am not sure if this is a big concern. (2) The bigger issue is that the user will never be able to insert a Put beyond the delete marker. Today, if the user makes a mistake, the admin can go in, delete the puts, do a major compaction, and then the user can reinsert the correct Puts. This workflow would be nullified if we kept delete-markers even after major compaction. (3) Today the user doesn't even know that there are delete markers. But that will have to change if we start keeping delete-markers beyond major compactions. === I don't get the reasoning behind why we need to keep deleted puts when syncing logs from one cluster to another. The problem that I am concerned about is the following: (1) A delete marker arrives from the source cluster. (2) A major compaction happens on the target cluster, which gets rid of the delete marker. (3) The deleted put arrives from the source cluster. Now that the delete marker is not there, this put will become visible on the target cluster. Retain Delete Markers after Major Compaction Key: HBASE-4721 URL: https://issues.apache.org/jira/browse/HBASE-4721 Project: HBase Issue Type: New Feature Reporter: Prakash Khemani Assignee: Prakash Khemani There is a need to provide long TTLs for delete markers. This is useful when replicating HBase logs from one cluster to another. The receiving cluster shouldn't compact away the delete markers because the affected key-values might still be on the way. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
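For reference, the HBASE-4536 behavior mentioned above is a per-column-family flag. Something along these lines should enable it with the 0.94-era client API (names worth double-checking against the exact release; the table and family names here are made up):
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;

// Create a table whose column family retains delete markers (and the cells
// they mask) until the TTL expires, instead of dropping them at the next
// major compaction.
public class KeepDeletedExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);
    HTableDescriptor table = new HTableDescriptor("replicated_table");
    HColumnDescriptor cf = new HColumnDescriptor("cf");
    cf.setKeepDeletedCells(true);    // the HBASE-4536 flag discussed above
    cf.setTimeToLive(7 * 24 * 3600); // e.g. keep a week of history
    table.addFamily(cf);
    admin.createTable(table);
  }
}
{code}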
[jira] [Commented] (HBASE-4674) splitLog silently fails
[ https://issues.apache.org/jira/browse/HBASE-4674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13139857#comment-13139857 ] Prakash Khemani commented on HBASE-4674: Stack, it is pretty obvious in the code. And yes, I have seen lost edits a number of times. A simple way to reproduce this issue: Create a table. Kill the namenode. That kills all region servers. The master doesn't die. The master tries to split logs and fails, but it ignores the failure and moves on to assign regions. Start the namenode. Start the regionservers. The regions get assigned w/o their logs getting replayed. == BTW, the fix for this is being posted by Nicolas in HBASE-2312. splitLog silently fails --- Key: HBASE-4674 URL: https://issues.apache.org/jira/browse/HBASE-4674 Project: HBase Issue Type: Bug Affects Versions: 0.92.0 Environment: splitLog() can fail silently and a region can open w/o its edits getting replayed. Reporter: Prakash Khemani Assignee: Prakash Khemani Priority: Blocker -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
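The shape of the bug, and of the kind of fix referenced on HBASE-2312, is that log splitting failures must block region assignment. A simplified, hypothetical sketch of the desired control flow (the interfaces are stand-ins, not the actual master code):
{code}
import java.io.IOException;

// Hypothetical sketch: keep retrying the split and never fall through to
// assignment on failure, so regions cannot open without replayed edits.
class ShutdownHandlerSketch {
  interface LogSplitter { void splitLog(String serverName) throws IOException; }
  interface Assigner   { void assignRegions(String serverName); }

  void process(String deadServer, LogSplitter splitter, Assigner assigner)
      throws InterruptedException {
    while (true) {
      try {
        splitter.splitLog(deadServer);
        break; // only reached when the split actually succeeded
      } catch (IOException e) {
        // Previously the failure was effectively ignored and assignment
        // proceeded, silently losing edits. Instead: log, wait, retry.
        System.err.println("log split failed for " + deadServer + ": " + e);
        Thread.sleep(1000);
      }
    }
    assigner.assignRegions(deadServer); // safe: logs have been split
  }
}
{code}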
[jira] [Commented] (HBASE-4696) HRegionThriftServer might have to indefinitely do redirects
[ https://issues.apache.org/jira/browse/HBASE-4696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13138891#comment-13138891 ] Prakash Khemani commented on HBASE-4696: looks good to me. thanks. HRegionThriftServer might have to indefinitely do redirects - Key: HBASE-4696 URL: https://issues.apache.org/jira/browse/HBASE-4696 Project: HBase Issue Type: Improvement Affects Versions: 0.94.0 Reporter: Prakash Khemani Assignee: Jonathan Gray Fix For: 0.94.0 Attachments: HBASE-4696-v1.patch HRegionThriftServer.getRowWithColumnsTs() redirects the request to the correct region server if it has landed on the wrong region server. With this approach the smart client will never get a NotServingRegionException, so it will never be able to invalidate its cache; it will indefinitely send the request to the wrong region server, and the wrong region server will always be redirecting it. Either redirects should be turned off and the client should react to NotServingRegionExceptions, or somehow a flag should be set in the response telling the client to refresh its cache. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
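Of the two options in the description, the first might look like this hedged sketch: surface the miss to the client instead of redirecting, so the smart client invalidates its cached region location. All names here are illustrative stand-ins, not the actual HRegionThriftServer code:
{code}
import java.io.IOException;

// Hypothetical sketch of option 1: throw rather than silently forward,
// so the client can refresh its region cache before retrying.
class ThriftGetSketch {
  interface RegionLookup { boolean isLocal(byte[] row); }
  interface LocalGet     { byte[] get(byte[] row) throws IOException; }

  static class NotServingRegionException extends IOException {
    NotServingRegionException(String msg) { super(msg); }
  }

  byte[] getRowWithColumnsTs(byte[] row, RegionLookup lookup, LocalGet local)
      throws IOException {
    if (!lookup.isLocal(row)) {
      // No redirect: let the smart client see the miss and invalidate
      // its cached region location.
      throw new NotServingRegionException("region not hosted here");
    }
    return local.get(row);
  }
}
{code}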