[jira] [Commented] (HBASE-5179) Concurrent processing of processFaileOver and ServerShutdownHandler may cause region is assigned before completing split log, it would cause data loss
[ https://issues.apache.org/jira/browse/HBASE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13185075#comment-13185075 ]

Zhihong Yu commented on HBASE-5179:
---

For Jinchao's patch v3: I think it makes sense. Let us know the result of verification.

{code}
+  boolean isProcessingServer(HServerAddress address) {
+    if (serverManager.getDeadServersBeingProcessed() == null) {
+      return false;
+    }
{code}
A better name for the method would be isDeadServerBeingProcessed().

{code}
-    rit = this.assignmentManager.
-      processRegionInTransitionAndBlockUntilAssigned(HRegionInfo.FIRST_META_REGIONINFO);
+    rit = this.assignmentManager
+      .processRegionInTransitionAndBlockUntilAssigned(HRegionInfo.FIRST_META_REGIONINFO);
{code}
There is no material change above. We'd better remove it.

> Concurrent processing of processFaileOver and ServerShutdownHandler may cause region is assigned before completing split log, it would cause data loss
> ---
>
>                 Key: HBASE-5179
>                 URL: https://issues.apache.org/jira/browse/HBASE-5179
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.90.2
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: 5179-90.txt, 5179-90v2.patch, 5179-90v3.patch, 5179-v2.txt, 5179-v3.txt, 5179-v4.txt, hbase-5179.patch, hbase-5179v5.patch
>
>
> If the master's failover processing and ServerShutdownHandler's processing happen concurrently, the following case may occur:
> 1. The master completes splitLogAfterStartup().
> 2. RegionserverA restarts, and ServerShutdownHandler is processing it.
> 3. The master starts rebuildUserRegions, and RegionserverA is considered a dead server.
> 4. The master starts to assign the regions of RegionserverA because step 3 marked it as a dead server.
> However, during step 4 (assigning regions), ServerShutdownHandler may still be splitting the log, so data loss may result.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
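The race in the issue description, together with the method under review, suggests a guard of roughly the following shape: before the failover path assigns a dead server's regions, it asks whether ServerShutdownHandler is still processing that server. This is a minimal, self-contained sketch and not the patch itself. The class `DeadServerGuard` and the methods `startProcessing`, `finishProcessing` and `canAssignRegionsOf` are hypothetical, and servers are plain strings rather than HServerAddress. Only the name isDeadServerBeingProcessed() comes from the review comment above.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for the master-side bookkeeping; not HBase code. */
class DeadServerGuard {
  // Dead servers whose logs ServerShutdownHandler is (assumed to be) still splitting.
  private final Set<String> beingProcessed = ConcurrentHashMap.newKeySet();

  /** Called when shutdown handling for a server begins. */
  void startProcessing(String server) {
    beingProcessed.add(server);
  }

  /** Called once log splitting and reassignment for the server are done. */
  void finishProcessing(String server) {
    beingProcessed.remove(server);
  }

  /** Name suggested in the review; true while shutdown handling is in flight. */
  boolean isDeadServerBeingProcessed(String server) {
    return beingProcessed.contains(server);
  }

  /** The failover path would skip servers for which this returns false. */
  boolean canAssignRegionsOf(String deadServer) {
    return !isDeadServerBeingProcessed(deadServer);
  }
}
```

In this sketch the failover path would leave the regions of a server alone while `canAssignRegionsOf` returns false, and rely on the shutdown handler to reassign them after the log split completes.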
[jira] [Commented] (HBASE-5179) Concurrent processing of processFaileOver and ServerShutdownHandler may cause region is assigned before completing split log, it would cause data loss
[ https://issues.apache.org/jira/browse/HBASE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13185039#comment-13185039 ]

Hadoop QA commented on HBASE-5179:
--

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12510381/5179-90v3.patch
  against trunk revision .

    +1 @author. The patch does not contain any @author tags.

    -1 tests included. The patch doesn't appear to include any new or modified tests.
       Please justify why no new tests are needed for this patch.
       Also please list what manual steps were performed to verify this patch.

    -1 patch. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/746//console

This message is automatically generated.
[jira] [Commented] (HBASE-5179) Concurrent processing of processFaileOver and ServerShutdownHandler may cause region is assigned before completing split log, it would cause data loss
[ https://issues.apache.org/jira/browse/HBASE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13185020#comment-13185020 ]

Hadoop QA commented on HBASE-5179:
--

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12510379/5179-90v3.patch
  against trunk revision .

    +1 @author. The patch does not contain any @author tags.

    -1 tests included. The patch doesn't appear to include any new or modified tests.
       Please justify why no new tests are needed for this patch.
       Also please list what manual steps were performed to verify this patch.

    -1 patch. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/745//console

This message is automatically generated.
[jira] [Commented] (HBASE-5179) Concurrent processing of processFaileOver and ServerShutdownHandler may cause region is assigned before completing split log, it would cause data loss
[ https://issues.apache.org/jira/browse/HBASE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184956#comment-13184956 ]

Zhihong Yu commented on HBASE-5179:
---

Since Ram is busy with releasing 0.90.6, I think we should check in the patch for this issue.
Hbase-4748 can be tackled in its own Jira.
[jira] [Commented] (HBASE-5179) Concurrent processing of processFaileOver and ServerShutdownHandler may cause region is assigned before completing split log, it would cause data loss
[ https://issues.apache.org/jira/browse/HBASE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184843#comment-13184843 ]

Zhihong Yu commented on HBASE-5179:
---

@Ram:
I don't see a patch attached to hbase-4748.
Are you going to provide a combined patch?
[jira] [Commented] (HBASE-5179) Concurrent processing of processFaileOver and ServerShutdownHandler may cause region is assigned before completing split log, it would cause data loss
[ https://issues.apache.org/jira/browse/HBASE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184772#comment-13184772 ]

stack commented on HBASE-5179:
--

Sure. Do what you fellas think best.
[jira] [Commented] (HBASE-5179) Concurrent processing of processFaileOver and ServerShutdownHandler may cause region is assigned before completing split log, it would cause data loss
[ https://issues.apache.org/jira/browse/HBASE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184764#comment-13184764 ]

chunhui shen commented on HBASE-5179:
-

I think so too.
[jira] [Commented] (HBASE-5179) Concurrent processing of processFaileOver and ServerShutdownHandler may cause region is assigned before completing split log, it would cause data loss
[ https://issues.apache.org/jira/browse/HBASE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184760#comment-13184760 ]

Zhihong Yu commented on HBASE-5179:
---

You mean hbase-4748, right?
I think we should combine the two.
[jira] [Commented] (HBASE-5179) Concurrent processing of processFaileOver and ServerShutdownHandler may cause region is assigned before completing split log, it would cause data loss
[ https://issues.apache.org/jira/browse/HBASE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184755#comment-13184755 ]

ramkrishna.s.vasudevan commented on HBASE-5179:
---

@Ted, @Stack, @Chunhui
I think we may have to combine the change in HBASE-4879, as Chunhui suggested on 12/Jan/12 03:23.
Is it OK to combine it? Only then can the processFailOver and SSH problem be solved completely. Please advise.
[jira] [Commented] (HBASE-5179) Concurrent processing of processFaileOver and ServerShutdownHandler may cause region is assigned before completing split log, it would cause data loss
[ https://issues.apache.org/jira/browse/HBASE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184754#comment-13184754 ]

stack commented on HBASE-5179:
--

I think getDeadServersInProgress is better than getDeadServersBeingProcessed since it relates to areDeadServersInProgress (I can fix this on commit; I would also rename the Collection in DeadServers to inProgress).

Yeah, I would be interested in the notion that we do this server checking inside ServerManager, so that when you ask for onlineServers this has already been done for you... or is the thought that ServerManager need not know about 'handlers', and that only HMaster should have to know what's running under it (a ServerManager and handlers such as ServerShutdownHandler)?
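One possible reading of the ServerManager idea above is that the manager itself tracks which dead servers are still in progress and filters them out whenever callers ask for the online servers. The sketch below illustrates only that reading. `ServerManagerSketch`, `serverOnline`, `expire` and `shutdownHandled` are hypothetical names, servers are plain strings, and only getDeadServersInProgress and areDeadServersInProgress are taken from the comment.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical illustration; the real ServerManager API may differ. */
class ServerManagerSketch {
  private final Set<String> onlineServers = new HashSet<>();
  // Dead servers whose shutdown handling has not finished yet.
  private final Set<String> deadServersInProgress = new HashSet<>();

  synchronized void serverOnline(String server) {
    onlineServers.add(server);
  }

  /** A server is declared dead and its shutdown handling starts. */
  synchronized void expire(String server) {
    onlineServers.remove(server);
    deadServersInProgress.add(server);
  }

  /** Shutdown handling (including log splitting) has completed. */
  synchronized void shutdownHandled(String server) {
    deadServersInProgress.remove(server);
  }

  synchronized boolean areDeadServersInProgress() {
    return !deadServersInProgress.isEmpty();
  }

  synchronized Set<String> getDeadServersInProgress() {
    return Collections.unmodifiableSet(new HashSet<>(deadServersInProgress));
  }

  /** Callers get a view with in-progress dead servers already removed. */
  synchronized Set<String> getOnlineServers() {
    Set<String> result = new HashSet<>(onlineServers);
    result.removeAll(deadServersInProgress);
    return result;
  }
}
```

The trade-off raised in the comment still applies: filtering inside the manager spares every caller the check, but it means the manager has to be told about handler progress.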
[jira] [Commented] (HBASE-5179) Concurrent processing of processFaileOver and ServerShutdownHandler may cause region is assigned before completing split log, it would cause data loss
[ https://issues.apache.org/jira/browse/HBASE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184734#comment-13184734 ]

Hadoop QA commented on HBASE-5179:
--

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12510311/5179-90v2.patch
  against trunk revision .

    +1 @author. The patch does not contain any @author tags.

    -1 tests included. The patch doesn't appear to include any new or modified tests.
       Please justify why no new tests are needed for this patch.
       Also please list what manual steps were performed to verify this patch.

    -1 patch. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/744//console

This message is automatically generated.
[jira] [Commented] (HBASE-5179) Concurrent processing of processFaileOver and ServerShutdownHandler may cause region is assigned before completing split log, it would cause data loss
[ https://issues.apache.org/jira/browse/HBASE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184703#comment-13184703 ]

Hadoop QA commented on HBASE-5179:
--

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12510304/hbase-5179v5.patch
  against trunk revision .

    +1 @author. The patch does not contain any @author tags.

    -1 tests included. The patch doesn't appear to include any new or modified tests.
       Please justify why no new tests are needed for this patch.
       Also please list what manual steps were performed to verify this patch.

    -1 javadoc. The javadoc tool appears to have generated -147 warning messages.

    +1 javac. The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs. The patch appears to introduce 80 new Findbugs (version 1.3.9) warnings.

    +1 release audit. The applied patch does not increase the total number of release audit warnings.

    -1 core tests. The patch failed these unit tests:
       org.apache.hadoop.hbase.mapreduce.TestImportTsv
       org.apache.hadoop.hbase.regionserver.wal.TestHLog
       org.apache.hadoop.hbase.mapred.TestTableMapReduce
       org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/741//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/741//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/741//console

This message is automatically generated.
[jira] [Commented] (HBASE-5179) Concurrent processing of processFaileOver and ServerShutdownHandler may cause region is assigned before completing split log, it would cause data loss
[ https://issues.apache.org/jira/browse/HBASE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184697#comment-13184697 ]

Zhihong Yu commented on HBASE-5179:
---

{code}
+ * Class to hold dead servers list, utility querying dead server list and being
+ * processed dead servers by the ServerShutdownHandler.
{code}
The above should read 'querying dead server list and the dead servers being processed by ...'.
[jira] [Commented] (HBASE-5179) Concurrent processing of processFaileOver and ServerShutdownHandler may cause region is assigned before completing split log, it would cause data loss
[ https://issues.apache.org/jira/browse/HBASE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184668#comment-13184668 ]

chunhui shen commented on HBASE-5179:
-

I agree with the renaming in patchV4.
[jira] [Commented] (HBASE-5179) Concurrent processing of processFaileOver and ServerShutdownHandler may cause region is assigned before completing split log, it would cause data loss
[ https://issues.apache.org/jira/browse/HBASE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184612#comment-13184612 ]

Hadoop QA commented on HBASE-5179:
--

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12510277/5179-v4.txt
  against trunk revision .

    +1 @author. The patch does not contain any @author tags.

    -1 tests included. The patch doesn't appear to include any new or modified tests.
       Please justify why no new tests are needed for this patch.
       Also please list what manual steps were performed to verify this patch.

    -1 javadoc. The javadoc tool appears to have generated -147 warning messages.

    +1 javac. The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs. The patch appears to introduce 79 new Findbugs (version 1.3.9) warnings.

    +1 release audit. The applied patch does not increase the total number of release audit warnings.

    -1 core tests. The patch failed these unit tests:
       org.apache.hadoop.hbase.mapreduce.TestImportTsv
       org.apache.hadoop.hbase.mapred.TestTableMapReduce
       org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/737//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/737//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/737//console

This message is automatically generated.
[jira] [Commented] (HBASE-5179) Concurrent processing of processFaileOver and ServerShutdownHandler may cause region is assigned before completing split log, it would cause data loss
[ https://issues.apache.org/jira/browse/HBASE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184536#comment-13184536 ]

Hadoop QA commented on HBASE-5179:
--

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12510266/5179-v3.txt
against trunk revision .

+1 @author. The patch does not contain any @author tags.

-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.

-1 javadoc. The javadoc tool appears to have generated -147 warning messages.

+1 javac. The applied patch does not increase the total number of javac compiler warnings.

-1 findbugs. The patch appears to introduce 80 new Findbugs (version 1.3.9) warnings.

+1 release audit. The applied patch does not increase the total number of release audit warnings.

-1 core tests. The patch failed these unit tests:
  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
  org.apache.hadoop.hbase.mapred.TestTableMapReduce
  org.apache.hadoop.hbase.mapreduce.TestImportTsv

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/735//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/735//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/735//console

This message is automatically generated.
[jira] [Commented] (HBASE-5179) Concurrent processing of processFaileOver and ServerShutdownHandler may cause region is assigned before completing split log, it would cause data loss
[ https://issues.apache.org/jira/browse/HBASE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184533#comment-13184533 ]

stack commented on HBASE-5179:
--

bq. I think the reason Chunhui introduced a new Set for the dead servers being processed is that DeadServer is supposed to remember dead servers

Yeah, I seem to remember such a need, but I'd think we should doc it up some more in DeadServer so the next person in here looking at the code has a chance of figuring out what's up.

On v3:

{code}
getDeadServersUnderProcessing
{code}

is still public, and I think it should be named getDeadServersBeingProcessed ... or BeingHandled ... or, better, so that it matches: areDeadServersInProgress, getDeadServersInProgress. They are in the process of being made into DeadServers! (And there is missing javadoc explaining what this method is, at least relative to getDeadServers: that these are the servers going through ServerShutdownHandler processing.)

Does this method need to be in the Interface for ServerManager? (The less in the Interface the better.)

knownServers should be onlineServers, which makes me think that this check for DeadServersInProgress should be made inside ServerManager, so that what comes out of getOnlineServers has already had the InProgress servers stripped?

Do you think the new Collection deadServersUnderProcessing should instead be called inProgress, so that a server is either in inProgress or in the deadServers list? On remove, it gets moved (under synchronize) from one list to the other.
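stack's last suggestion (a server is either in inProgress or in the deadServers list, and is moved from one to the other under synchronization) could look roughly like the following. This is a sketch under assumptions, not the patch: `DeadServerTrackerSketch` and all of its method names are hypothetical, and servers are represented as plain strings.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical two-state tracker; names are illustrative, not from DeadServer.
class DeadServerTrackerSketch {
  // Servers currently going through shutdown handling (log splitting not yet done).
  private final Set<String> inProgress = new HashSet<>();
  // Servers whose shutdown handling has completed.
  private final Set<String> deadServers = new HashSet<>();

  // Called when a server is declared dead and shutdown handling begins.
  synchronized void add(String serverName) {
    deadServers.remove(serverName);
    inProgress.add(serverName);
  }

  // Called when shutdown handling completes: move the server from one set to
  // the other in a single synchronized step, so it is never in both or neither.
  synchronized void finish(String serverName) {
    if (inProgress.remove(serverName)) {
      deadServers.add(serverName);
    }
  }

  synchronized boolean isInProgress(String serverName) {
    return inProgress.contains(serverName);
  }

  // Both getters return copies, so callers cannot mutate internal state.
  synchronized Set<String> getDeadServers() {
    return new HashSet<>(deadServers);
  }

  synchronized Set<String> getDeadServersInProgress() {
    return new HashSet<>(inProgress);
  }
}
```

Because every method synchronizes on the same object, a caller never observes a server in both sets or in neither while it is being moved.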
[jira] [Commented] (HBASE-5179) Concurrent processing of processFaileOver and ServerShutdownHandler may cause region is assigned before completing split log, it would cause data loss
[ https://issues.apache.org/jira/browse/HBASE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184450#comment-13184450 ]

Hadoop QA commented on HBASE-5179:
--

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12510261/5179-90.txt
against trunk revision .

+1 @author. The patch does not contain any @author tags.

-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.

-1 patch. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/733//console

This message is automatically generated.
[jira] [Commented] (HBASE-5179) Concurrent processing of processFaileOver and ServerShutdownHandler may cause region is assigned before completing split log, it would cause data loss
[ https://issues.apache.org/jira/browse/HBASE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184448#comment-13184448 ]

Zhihong Yu commented on HBASE-5179:
---

I think the reason Chunhui introduced a new Set for the dead servers being processed is that DeadServer is supposed to remember dead servers:

{code}
 * Set of known dead servers. On znode expiration, servers are added here.
{code}

DeadServer.cleanPreviousInstance() is called by ServerManager.checkIsDead() when the server becomes live again.
[jira] [Commented] (HBASE-5179) Concurrent processing of processFaileOver and ServerShutdownHandler may cause region is assigned before completing split log, it would cause data loss
[ https://issues.apache.org/jira/browse/HBASE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184303#comment-13184303 ]

Zhihong Yu commented on HBASE-5179:
---

TestRollingRestart fails in 0.90 with patch.
[jira] [Commented] (HBASE-5179) Concurrent processing of processFaileOver and ServerShutdownHandler may cause region is assigned before completing split log, it would cause data loss
[ https://issues.apache.org/jira/browse/HBASE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184296#comment-13184296 ]

Zhihong Yu commented on HBASE-5179:
---

@Stack:
The following code is for 0.90 branch:

{code}
-      } else if (!serverManager.isServerOnline(regionLocation.getServerName())) {
+      } else if (!onlineServers.contains(regionLocation.getHostname())) {
{code}

I agree that serversWithoutSplitLog isn't a very good name. It holds both online servers and dead servers. How about naming it knownServers?

ServerManager.java already has:

{code}
  public Set getDeadServers() {
    return this.deadservers.clone();
  }
{code}
[jira] [Commented] (HBASE-5179) Concurrent processing of processFaileOver and ServerShutdownHandler may cause region is assigned before completing split log, it would cause data loss
[ https://issues.apache.org/jira/browse/HBASE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184286#comment-13184286 ]

stack commented on HBASE-5179:
--

I agree with the spirit of this class. Good stuff Chunhui.

This is an awkward name for a method: getDeadServersUnderProcessing. Should it be getDeadServers? Does it need to be a public method? Seems fine that it be package private.

Is serversWithoutSplitLog a good name for a local variable? Should it be deadServers, with a comment saying that deadServers are processed by ServerShutdownHandler and that it will be taking care of the log splitting?

Is this right -- for trunk?

{code}
-      } else if (!serverManager.isServerOnline(regionLocation.getServerName())) {
+      } else if (!onlineServers.contains(regionLocation.getHostname())) {
{code}

Online servers is keyed by a ServerName, not a hostname.

What is a deadServersUnderProcessing? Does DeadServers keep a list of all servers that ever died? Is that a good idea? Shouldn't finish remove the item from deadservers rather than just from deadServersUnderProcessing?

Change the name of this method, cloneProcessingDeadServers. Just call it getDeadServers? That it's a clone is an internal implementation detail?
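stack's point that online servers are keyed by a ServerName rather than a hostname can be illustrated with a small sketch. Everything here is an assumption made for illustration: `ServerNameSketch` is a hypothetical stand-in with hostname, port and startcode fields, not HBase's own class, and the host and port values are made up. Since `Set.contains` accepts any `Object`, looking up a hostname string compiles but can never match.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for a server identity; not HBase's ServerName class.
final class ServerNameSketch {
  final String hostname;
  final int port;
  final long startcode;

  ServerNameSketch(String hostname, int port, long startcode) {
    this.hostname = hostname;
    this.port = port;
    this.startcode = startcode;
  }

  @Override
  public boolean equals(Object o) {
    if (!(o instanceof ServerNameSketch)) {
      return false;
    }
    ServerNameSketch other = (ServerNameSketch) o;
    return hostname.equals(other.hostname)
        && port == other.port
        && startcode == other.startcode;
  }

  @Override
  public int hashCode() {
    int h = hostname.hashCode();
    h = 31 * h + port;
    h = 31 * h + (int) (startcode ^ (startcode >>> 32));
    return h;
  }
}

class KeyMismatchSketch {
  // Compiles because Set.contains takes Object, but a String never equals a
  // ServerNameSketch, so this always returns false.
  static boolean lookupByHostname(Set<ServerNameSketch> online, String hostname) {
    return online.contains(hostname);
  }

  // Looks the server up by its full identity, matching how the set is keyed.
  static boolean lookupByServerName(Set<ServerNameSketch> online, ServerNameSketch name) {
    return online.contains(name);
  }

  public static void main(String[] args) {
    Set<ServerNameSketch> online = new HashSet<>();
    online.add(new ServerNameSketch("rs1.example.org", 60020, 1L));
    System.out.println(lookupByHostname(online, "rs1.example.org"));
    System.out.println(
        lookupByServerName(online, new ServerNameSketch("rs1.example.org", 60020, 1L)));
  }
}
```

With a check written like the hostname lookup, every region location would look offline, which is presumably why the key type matters for the `contains` call quoted above.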
[jira] [Commented] (HBASE-5179) Concurrent processing of processFaileOver and ServerShutdownHandler may cause region is assigned before completing split log, it would cause data loss
[ https://issues.apache.org/jira/browse/HBASE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184287#comment-13184287 ]

stack commented on HBASE-5179:
--

It's hard to do a test for this?
[jira] [Commented] (HBASE-5179) Concurrent processing of processFaileOver and ServerShutdownHandler may cause region is assigned before completing split log, it would cause data loss
[ https://issues.apache.org/jira/browse/HBASE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184244#comment-13184244 ]

Zhihong Yu commented on HBASE-5179:
---

I ran the following on MacBook and they passed:

{code}
mt -Dtest=TestSplitLogManager
mt -Dtest=TestAdmin#testShouldCloseTheRegionBasedOnTheEncodedRegionName
{code}
[jira] [Commented] (HBASE-5179) Concurrent processing of processFaileOver and ServerShutdownHandler may cause region is assigned before completing split log, it would cause data loss
[ https://issues.apache.org/jira/browse/HBASE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184240#comment-13184240 ]

Hadoop QA commented on HBASE-5179:
--

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12510215/5179-90.txt
against trunk revision .

+1 @author. The patch does not contain any @author tags.

-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.

-1 patch. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/732//console

This message is automatically generated.
[jira] [Commented] (HBASE-5179) Concurrent processing of processFaileOver and ServerShutdownHandler may cause region is assigned before completing split log, it would cause data loss
[ https://issues.apache.org/jira/browse/HBASE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184239#comment-13184239 ]

Hadoop QA commented on HBASE-5179:
--

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12510206/5179-v2.txt
against trunk revision .

+1 @author. The patch does not contain any @author tags.

-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.

-1 javadoc. The javadoc tool appears to have generated -147 warning messages.

+1 javac. The applied patch does not increase the total number of javac compiler warnings.

-1 findbugs. The patch appears to introduce 78 new Findbugs (version 1.3.9) warnings.

+1 release audit. The applied patch does not increase the total number of release audit warnings.

-1 core tests. The patch failed these unit tests:
  org.apache.hadoop.hbase.master.TestSplitLogManager
  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
  org.apache.hadoop.hbase.client.TestAdmin
  org.apache.hadoop.hbase.mapred.TestTableMapReduce
  org.apache.hadoop.hbase.mapreduce.TestImportTsv

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/730//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/730//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/730//console

This message is automatically generated.
[jira] [Commented] (HBASE-5179) Concurrent processing of processFaileOver and ServerShutdownHandler may cause region is assigned before completing split log, it would cause data loss
[ https://issues.apache.org/jira/browse/HBASE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184227#comment-13184227 ]

ramkrishna.s.vasudevan commented on HBASE-5179:
---

Patch looks good to me. I will try it out in the cluster tomorrow.
[jira] [Commented] (HBASE-5179) Concurrent processing of processFaileOver and ServerShutdownHandler may cause region is assigned before completing split log, it would cause data loss
[ https://issues.apache.org/jira/browse/HBASE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184181#comment-13184181 ]

ramkrishna.s.vasudevan commented on HBASE-5179:
---

@Chunhui
Can you take a look at HBASE-4748? It is similar to this one, but there the data loss was w.r.t. META, leading to more critical data loss. It is quite rare but still possible. Do you have any suggestions for that?
[jira] [Commented] (HBASE-5179) Concurrent processing of processFaileOver and ServerShutdownHandler may cause region is assigned before completing split log, it would cause data loss
[ https://issues.apache.org/jira/browse/HBASE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184170#comment-13184170 ]

ramkrishna.s.vasudevan commented on HBASE-5179:
---

@Chunhui
Is this issue applicable for 0.90.6? If so, can you prepare a patch for 0.90 also?
[jira] [Commented] (HBASE-5179) Concurrent processing of processFaileOver and ServerShutdownHandler may cause region is assigned before completing split log, it would cause data loss
[ https://issues.apache.org/jira/browse/HBASE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184155#comment-13184155 ]

Hadoop QA commented on HBASE-5179:
--

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12510164/hbase-5179.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.

-1 javadoc. The javadoc tool appears to have generated -147 warning messages.

+1 javac. The applied patch does not increase the total number of javac compiler warnings.

-1 findbugs. The patch appears to introduce 78 new Findbugs (version 1.3.9) warnings.

+1 release audit. The applied patch does not increase the total number of release audit warnings.

-1 core tests. The patch failed these unit tests:
  org.apache.hadoop.hbase.mapreduce.TestImportTsv
  org.apache.hadoop.hbase.mapred.TestTableMapReduce
  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/728//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/728//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/728//console

This message is automatically generated.
> Concurrent processing of processFaileOver and ServerShutdownHandler may > cause region is assigned before completing split log, it would cause data loss > --- > > Key: HBASE-5179 > URL: https://issues.apache.org/jira/browse/HBASE-5179 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 0.90.2 >Reporter: chunhui shen >Assignee: chunhui shen > Attachments: hbase-5179.patch > > > If master's processing its failover and ServerShutdownHandler's processing > happen concurrently, it may appear following case. > 1.master completed splitLogAfterStartup() > 2.RegionserverA restarts, and ServerShutdownHandler is processing. > 3.master starts to rebuildUserRegions, and RegionserverA is considered as > dead server. > 4.master starts to assign regions of RegionserverA because it is a dead > server by step3. > However, when doing step4(assigning region), ServerShutdownHandler may be > doing split log, Therefore, it may cause data loss. -- This message is automatically generated by JIRA.
[jira] [Commented] (HBASE-5179) Concurrent processing of processFaileOver and ServerShutdownHandler may cause region is assigned before completing split log, it would cause data loss
[ https://issues.apache.org/jira/browse/HBASE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184120#comment-13184120 ] Zhihong Yu commented on HBASE-5179: --- {code} + private final Set processingDeadServers = new HashSet(); {code} The field name above sounds like a method name. How about naming it deadServersUnderProcessing? Related method names should be changed as well. {code} + * Called on startup. Figures whether a fresh cluster start of we are joining {code} This should read 'start or we are'. For ServerManager.java and DeadServer.java: {code} + public Set getProcessingDeadServers() { +return this.deadservers.cloneProcessingDeadServers(); + } {code} The method should be called cloneDeadServersUnderProcessing(). > Concurrent processing of processFaileOver and ServerShutdownHandler may > cause region is assigned before completing split log, it would cause data loss > --- > > Key: HBASE-5179 > URL: https://issues.apache.org/jira/browse/HBASE-5179 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 0.90.2 >Reporter: chunhui shen >Assignee: chunhui shen > Attachments: hbase-5179.patch > > > If master's processing its failover and ServerShutdownHandler's processing > happen concurrently, it may appear following case. > 1.master completed splitLogAfterStartup() > 2.RegionserverA restarts, and ServerShutdownHandler is processing. > 3.master starts to rebuildUserRegions, and RegionserverA is considered as > dead server. > 4.master starts to assign regions of RegionserverA because it is a dead > server by step3. > However, when doing step4(assigning region), ServerShutdownHandler may be > doing split log, Therefore, it may cause data loss. -- This message is automatically generated by JIRA. 
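The naming suggestions in the review above can be sketched as follows. This is only an illustration under stated assumptions: the element type (plain String server names), the synchronization, and the startProcessing/finishProcessing hooks are hypothetical. Only the names deadServersUnderProcessing and cloneDeadServersUnderProcessing() come from the review comment.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for DeadServer.java; the real class and its types differ.
class DeadServerSketch {
  // Noun-phrase field name, as suggested in the review.
  private final Set<String> deadServersUnderProcessing = new HashSet<String>();

  // Hypothetical hook: a shutdown handler has started for this server.
  synchronized void startProcessing(String serverName) {
    deadServersUnderProcessing.add(serverName);
  }

  // Hypothetical hook: the shutdown handler (including log split) has finished.
  synchronized void finishProcessing(String serverName) {
    deadServersUnderProcessing.remove(serverName);
  }

  // The "clone" prefix signals that callers receive a snapshot copy,
  // so they cannot modify the internal set.
  synchronized Set<String> cloneDeadServersUnderProcessing() {
    return new HashSet<String>(deadServersUnderProcessing);
  }
}
```

Returning a copy means a caller iterating over the result is not affected by handlers that finish concurrently, at the cost of the snapshot possibly being stale by the time it is used.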
[jira] [Commented] (HBASE-5179) Concurrent processing of processFaileOver and ServerShutdownHandler may cause region is assigned before completing split log, it would cause data loss
[ https://issues.apache.org/jira/browse/HBASE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183915#comment-13183915 ] chunhui shen commented on HBASE-5179: -
Master logs follow. Let's look at region a04d0ac0a360e8cf5edf74af4ce64b16.
{code}
2011-12-30 02:20:05,285 INFO org.apache.hadoop.hbase.master.HMaster: Master startup proceeding: master failover
2011-12-30 02:20:06,779 INFO org.apache.hadoop.hbase.master.ServerManager: Server start rejected; we already have dw83.kgb.sqa.cm4:60020 registered; existingServer=serverName=dw83.kgb.sqa.cm4,60020,1325180976942, load=(requests=0, regions=7, usedHeap=10831, maxHeap=15872), newServer=serverName=dw83.kgb.sqa.cm4,60020,1325182806080, load=(requests=0, regions=0, usedHeap=230, maxHeap=15872)
2011-12-30 02:20:06,779 INFO org.apache.hadoop.hbase.master.ServerManager: Triggering server recovery; existingServer dw83.kgb.sqa.cm4,60020,1325180976942 looks stale
2011-12-30 02:20:06,780 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: based on AM, current region=-ROOT-,,0.70236052 is on server=serverName=dw80.kgb.sqa.cm4,60020,1325180470774, load=(requests=0, regions=0, usedHeap=0, maxHeap=0) server being checked: dw83.kgb.sqa.cm4,60020,1325180976942
2011-12-30 02:20:06,780 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: based on AM, current region=.META.,,1.1028785192 is on server=serverName=dw80.kgb.sqa.cm4,60020,1325180470774, load=(requests=0, regions=0, usedHeap=0, maxHeap=0) server being checked: dw83.kgb.sqa.cm4,60020,1325180976942
2011-12-30 02:20:06,781 DEBUG org.apache.hadoop.hbase.master.ServerManager: Added=dw83.kgb.sqa.cm4,60020,1325180976942 to dead servers, submitted shutdown handler to be executed, root=false, meta=false
2011-12-30 02:20:07,839 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Creating writer path=hdfs://dw74.kgb.sqa.cm4:9000/hbase-common/writetest/a04d0ac0a360e8cf5edf74af4ce64b16/recovered.edits/00965355783.temp region=a04d0ac0a360e8cf5edf74af4ce64b16
2011-12-30 02:20:08,965 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:6-0x134784f727b0543 Creating (or updating) unassigned node for a04d0ac0a360e8cf5edf74af4ce64b16 with OFFLINE state
2011-12-30 02:20:08,988 INFO org.apache.hadoop.hbase.master.AssignmentManager: Failed-over master needs to process 14 regions in transition
2011-12-30 02:20:09,017 INFO org.apache.hadoop.hbase.master.AssignmentManager: Processing region writetest,B8VCH6I7EP0SLJA6KU8VTE75RCCZMZ14GTBRSTC7QOW9L2Q818R1O4PLA9ZX64JD5ZZTSAK021NUYUUHJ0BS9NTTCQ09PBRZMZPL,1325179237366.a04d0ac0a360e8cf5edf74af4ce64b16. in state M_ZK_REGION_OFFLINE
2011-12-30 02:20:09,017 DEBUG org.apache.hadoop.hbase.master.handler.ClosedRegionHandler: Handling CLOSED event for a04d0ac0a360e8cf5edf74af4ce64b16
2011-12-30 02:20:09,017 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; was=writetest,B8VCH6I7EP0SLJA6KU8VTE75RCCZMZ14GTBRSTC7QOW9L2Q818R1O4PLA9ZX64JD5ZZTSAK021NUYUUHJ0BS9NTTCQ09PBRZMZPL,1325179237366.a04d0ac0a360e8cf5edf74af4ce64b16. state=OFFLINE, ts=1325182808966
2011-12-30 02:20:09,020 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Assigning region writetest,B8VCH6I7EP0SLJA6KU8VTE75RCCZMZ14GTBRSTC7QOW9L2Q818R1O4PLA9ZX64JD5ZZTSAK021NUYUUHJ0BS9NTTCQ09PBRZMZPL,1325179237366.a04d0ac0a360e8cf5edf74af4ce64b16. to dw81.kgb.sqa.cm4,60020,1325181205124
2011-12-30 02:20:09,365 DEBUG org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: Opened region writetest,B8VCH6I7EP0SLJA6KU8VTE75RCCZMZ14GTBRSTC7QOW9L2Q818R1O4PLA9ZX64JD5ZZTSAK021NUYUUHJ0BS9NTTCQ09PBRZMZPL,1325179237366.a04d0ac0a360e8cf5edf74af4ce64b16. on dw81.kgb.sqa.cm4,60020,1325181205124
2011-12-30 02:20:20,144 INFO org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Closed path hdfs://dw74.kgb.sqa.cm4:9000/hbase-common/writetest/a04d0ac0a360e8cf5edf74af4ce64b16/recovered.edits/00965355783.temp (wrote 146434 edits in 1761ms)
{code}
> Concurrent processing of processFaileOver and ServerShutdownHandler may > cause region is assigned before completing split log, it would cause data loss > --- > > Key: HBASE-5179 > URL: https://issues.apache.org/jira/browse/HBASE-5179 > Project: HBase > Issue Type: Bug > Components: master >Reporter: chunhui shen >Assignee: chunhui shen > > If master's processing its failover and ServerShutdownHandler's processing > happen concurrently, it may appear following case. > 1.master completing splitLogAfterStartup() > 2.RegionserverA restarts, and ServerShutdownHandler is processing. > 3.master starts to rebuildUserRegions, and RegionserverA is considered as > dead server. > 4.master