[jira] [Commented] (HBASE-6134) Improvement for split-worker to speed up distributed-split-log
[ https://issues.apache.org/jira/browse/HBASE-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293602#comment-13293602 ]

Hadoop QA commented on HBASE-6134:
----------------------------------

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12531772/HBASE-6134v4.patch
  against trunk revision .

    +1 @author. The patch does not contain any @author tags.
    +1 tests included. The patch appears to include 3 new or modified tests.
    +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.
    +1 javadoc. The javadoc tool did not generate any warning messages.
    +1 javac. The applied patch does not increase the total number of javac compiler warnings.
    -1 findbugs. The patch appears to introduce 6 new Findbugs (version 1.3.9) warnings.
    +1 release audit. The applied patch does not increase the total number of release audit warnings.
    -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.io.hfile.TestForceCacheImportantBlocks
        org.apache.hadoop.hbase.replication.TestReplication
        org.apache.hadoop.hbase.master.TestSplitLogManager

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2147//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2147//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2147//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2147//console

This message is automatically generated.
Improvement for split-worker to speed up distributed-split-log
--------------------------------------------------------------

                Key: HBASE-6134
                URL: https://issues.apache.org/jira/browse/HBASE-6134
            Project: HBase
         Issue Type: Improvement
         Components: wal
           Reporter: chunhui shen
           Assignee: chunhui shen
           Priority: Critical
            Fix For: 0.96.0
        Attachments: HBASE-6134.patch, HBASE-6134v2.patch, HBASE-6134v3-92.patch, HBASE-6134v3.patch, HBASE-6134v4.patch

First, we ran a test comparing local-master-splitting with distributed-log-splitting.
Environment: 34 hlog files, 5 regionservers (after killing one, only 4 regionservers do the splitting work), 400 regions in one hlog file.
local-master-split: 60s+
distributed-log-splitting: 165s+
In fact, in our production environment, distributed-log-splitting also took 60s with 30 regionservers for 34 hlog files (the regionservers may be under high load).

We found that a split-worker took about 20s to split one log file (30ms~50ms per writer.close(); 10ms per writer creation).
I think we could improve this by parallelizing the creation and closing of writers in threads.

The patch changes the logic for distributed-log-splitting to match local-master-splitting and parallelizes the close in threads.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
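The idea in the description, closing the per-region writers from a pool of threads instead of one after another, can be sketched as follows. This is only an illustration and not the code in the attached patches: the class name ParallelCloseSketch, the method closeAll, and the pool size are hypothetical, and plain java.io.Closeable objects stand in for the recovered.edits writers.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelCloseSketch {

  /** Closes every writer on a fixed-size pool and returns how many closed cleanly. */
  static int closeAll(List<? extends Closeable> writers, int threads)
      throws InterruptedException, IOException {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    List<Future<?>> pending = new ArrayList<>();
    try {
      for (Closeable w : writers) {
        // Each close() runs as its own task, so slow closes overlap.
        pending.add(pool.submit(() -> {
          w.close();
          return null;
        }));
      }
      int closed = 0;
      IOException firstFailure = null;
      for (Future<?> f : pending) {
        try {
          f.get(); // wait for this close to finish
          closed++;
        } catch (ExecutionException e) {
          // Remember the first failure but keep waiting for the other closes.
          if (firstFailure == null) {
            firstFailure = new IOException("close failed", e.getCause());
          }
        }
      }
      if (firstFailure != null) {
        throw firstFailure;
      }
      return closed;
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) throws Exception {
    // 400 stand-in writers whose close() sleeps briefly (assumed cost, for demonstration only).
    List<Closeable> writers = new ArrayList<>();
    for (int i = 0; i < 400; i++) {
      writers.add(() -> {
        try {
          Thread.sleep(5);
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      });
    }
    long start = System.nanoTime();
    int closed = closeAll(writers, 8);
    long ms = (System.nanoTime() - start) / 1_000_000;
    System.out.println("closed " + closed + " writers in " + ms + " ms");
  }
}
```

Waiting on every Future before rethrowing means one failed close does not leave the remaining writers unclosed; whether the real splitter should behave that way is a design choice this sketch does not settle.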
[jira] [Commented] (HBASE-6134) Improvement for split-worker to speed up distributed-split-log
[ https://issues.apache.org/jira/browse/HBASE-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13294053#comment-13294053 ]

Zhihong Ted Yu commented on HBASE-6134:
---------------------------------------

TestServerCustomProtocol passes locally.

Will integrate later if there is no objection.
[jira] [Commented] (HBASE-6134) Improvement for split-worker to speed up distributed-split-log
[ https://issues.apache.org/jira/browse/HBASE-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292743#comment-13292743 ]

chunhui shen commented on HBASE-6134:
-------------------------------------

Created a new review request: https://review.cloudera.org/r/2137/
[jira] [Commented] (HBASE-6134) Improvement for split-worker to speed up distributed-split-log
[ https://issues.apache.org/jira/browse/HBASE-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293345#comment-13293345 ]

Hadoop QA commented on HBASE-6134:
----------------------------------

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12531760/HBASE-6134v3-92.patch
  against trunk revision .

    +1 @author. The patch does not contain any @author tags.
    +1 tests included. The patch appears to include 3 new or modified tests.
    -1 patch. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2142//console

This message is automatically generated.
[jira] [Commented] (HBASE-6134) Improvement for split-worker to speed up distributed-split-log
[ https://issues.apache.org/jira/browse/HBASE-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13289178#comment-13289178 ]

Anoop Sam John commented on HBASE-6134:
---------------------------------------

Yes, I agree with Chunhui.
There is one buffer per HLogSplitter, and one HLogSplitter instance per log file that the RS is splitting. Correct me if I am wrong, Chunhui.
[jira] [Commented] (HBASE-6134) Improvement for split-worker to speed up distributed-split-log
[ https://issues.apache.org/jira/browse/HBASE-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13289067#comment-13289067 ]

chunhui shen commented on HBASE-6134:
-------------------------------------

On the review board, Prakash Khemani said:
bq. But the old code had a serious drawback – it would read the entire log file in memory before writing it out. Also the old code assumed that multiple log files were being split at the same time, but that is no longer true with distributed log splitting. Whatever approach we take, I don’t think we should re-introduce buffering of the entire log file in memory.

Since we cap the buffer size at 128MB, I am not sure what ill effect using the buffer would have. Anyway, log splitting happens infrequently.

What do others think?
[jira] [Commented] (HBASE-6134) Improvement for split-worker to speed up distributed-split-log
[ https://issues.apache.org/jira/browse/HBASE-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13289136#comment-13289136 ]

stack commented on HBASE-6134:
------------------------------

@Chunhui You are saying we only buffer 128M worth of the WAL, not the entire log file as Prakash says (You sure? Do we not read it all up into memory to bucket it by region and then, after that is done, start writing out the recovered.edits files? I could be wrong -- going by memory). If we are only using a 128M buffer, flushing to the recovered.edits file as we go, then that'd be fine I'd say.

Are there other improvements to be had, given the circumstance is now one reader only, writing multiple files?

Good on you.
[jira] [Commented] (HBASE-6134) Improvement for split-worker to speed up distributed-split-log
[ https://issues.apache.org/jira/browse/HBASE-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13289155#comment-13289155 ]

Anoop Sam John commented on HBASE-6134:
---------------------------------------

From the patch and the current HLogSplitter code, I can see that only a 128M (default value) buffer is used. Data is read from the WAL into this buffer, and parallel writers take data from the buffer and write it to the recovered.edits file of the corresponding region. Reading into the buffer and writing by the writers happen in parallel, in producer-consumer fashion.
This is how the split proceeds when an RS splits one file. If the same RS splits another file, another HLogSplitter instance will be created for it, with its own buffer.
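The producer-consumer arrangement described above (one reader filling a bounded buffer, several writer threads draining it into per-region outputs) can be illustrated with a small sketch. It is not HLogSplitter code: the names BoundedSplitSketch, Edit, and split are hypothetical, the buffer is bounded by entry count rather than by 128M of heap, and in-memory lists stand in for recovered.edits files.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

public class BoundedSplitSketch {

  /** A log entry tagged with the region it belongs to (hypothetical type). */
  static final class Edit {
    final String region;
    final String payload;

    Edit(String region, String payload) {
      this.region = region;
      this.payload = payload;
    }
  }

  /** Marker telling a writer thread that no more edits will arrive. */
  private static final Edit STOP = new Edit("", "");

  static Map<String, List<String>> split(List<Edit> log, int capacity, int writerThreads)
      throws InterruptedException {
    BlockingQueue<Edit> buffer = new ArrayBlockingQueue<>(capacity);
    Map<String, List<String>> out = new ConcurrentHashMap<>();
    List<Thread> writers = new ArrayList<>();

    for (int i = 0; i < writerThreads; i++) {
      Thread t = new Thread(() -> {
        try {
          while (true) {
            Edit e = buffer.take(); // blocks while the buffer is empty
            if (e == STOP) {
              return;
            }
            out.computeIfAbsent(e.region, r -> new CopyOnWriteArrayList<>()).add(e.payload);
          }
        } catch (InterruptedException ie) {
          Thread.currentThread().interrupt();
        }
      });
      t.start();
      writers.add(t);
    }

    // The single reader: put() blocks when the buffer is full, so memory
    // use stays bounded no matter how large the log is.
    for (Edit e : log) {
      buffer.put(e);
    }
    for (int i = 0; i < writerThreads; i++) {
      buffer.put(STOP);
    }
    for (Thread t : writers) {
      t.join();
    }
    return out;
  }
}
```

Unlike a real splitter, this sketch does not preserve the order of edits within a region when more than one writer thread is used; it only shows how a bounded buffer keeps the reader from running arbitrarily far ahead of the writers.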
[jira] [Commented] (HBASE-6134) Improvement for split-worker to speed up distributed-split-log
[ https://issues.apache.org/jira/browse/HBASE-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13289162#comment-13289162 ]

stack commented on HBASE-6134:
------------------------------

@Anoop So we flush to recovered.edits how often? If each recovered.edits has a buffer of 16megs and we have 1000 recovered.edits that we have not yet flushed, can that happen?
[jira] [Commented] (HBASE-6134) Improvement for split-worker to speed up distributed-split-log
[ https://issues.apache.org/jira/browse/HBASE-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13289166#comment-13289166 ]

chunhui shen commented on HBASE-6134:
-------------------------------------

@stack
It won't happen.
bq. So we flush to recovered.edits how often?
Writers are flushing data all along.
[jira] [Commented] (HBASE-6134) Improvement for split-worker to speed up distributed-split-log
[ https://issues.apache.org/jira/browse/HBASE-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13289168#comment-13289168 ]

chunhui shen commented on HBASE-6134:
-------------------------------------

Writers#doRun
{code}
LOG.debug("Writer thread " + this + ": starting");
while (true) {
  RegionEntryBuffer buffer = entryBuffers.getChunkToWrite();
  if (buffer == null) {
    // No data currently available, wait on some more to show up
    synchronized (dataAvailable) {
      if (shouldStop) return;
      try {
        dataAvailable.wait(1000);
      } catch (InterruptedException ie) {
        if (!shouldStop) {
          throw new RuntimeException(ie);
        }
      }
    }
    continue;
  }

  assert buffer != null;
  try {
    writeBuffer(buffer);
  } finally {
    entryBuffers.doneWriting(buffer);
  }
}
{code}

entryBuffers.getChunkToWrite() won't return null as long as there is data in the buffer.
[jira] [Commented] (HBASE-6134) Improvement for split-worker to speed up distributed-split-log
[ https://issues.apache.org/jira/browse/HBASE-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13286385#comment-13286385 ]

Hadoop QA commented on HBASE-6134:
----------------------------------

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12530341/HBASE-6134v3.patch
  against trunk revision .

    +1 @author. The patch does not contain any @author tags.
    +1 tests included. The patch appears to include 3 new or modified tests.
    +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.
    +1 javadoc. The javadoc tool did not generate any warning messages.
    +1 javac. The applied patch does not increase the total number of javac compiler warnings.
    -1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail.
    +1 release audit. The applied patch does not increase the total number of release audit warnings.
    -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.coprocessor.TestClassLoading
        org.apache.hadoop.hbase.regionserver.wal.TestHLog

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2074//testReport/
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2074//console

This message is automatically generated.
Improvement for split-worker to speed up distributed-split-log
--------------------------------------------------------------

                Key: HBASE-6134
                URL: https://issues.apache.org/jira/browse/HBASE-6134
            Project: HBase
         Issue Type: Improvement
         Components: wal
           Reporter: chunhui shen
           Assignee: chunhui shen
           Priority: Critical
            Fix For: 0.96.0
        Attachments: HBASE-6134.patch, HBASE-6134v2.patch, HBASE-6134v3.patch

First, we ran a test comparing local-master-split with distributed split log.
Environment: 34 hlog files, 5 regionservers (after killing one, only 4 regionservers do the splitting work)
local-master-split: 60s+
distributed-split-log: 165s+
In fact, in our production environment, distributed-split-log also took 60s with 30 regionservers for 34 hlog files (the regionservers may be under high load).

We found that a split-worker took about 20s to split one log file.
I think we should improve this.
[jira] [Commented] (HBASE-6134) Improvement for split-worker to speed up distributed-split-log
[ https://issues.apache.org/jira/browse/HBASE-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13286638#comment-13286638 ]

Kannan Muthukkaruppan commented on HBASE-6134:
----------------------------------------------

The SequenceFile writer does its own buffering. And so, my guess is that, at close time, the flushing of these buffers for every region's recovered.edits (SequenceFile) file is what is causing the close to be slow, because it is serialized on these flushes and then on the DN reporting back to the NN that it has received the blocks.

Parallelizing the close in threads makes sense. Good catch Chunhui!

May I request you to update the description for the JIRA with a summary of what the fix does, so it is useful for future readers of this JIRA.
[jira] [Commented] (HBASE-6134) Improvement for split-worker to speed up distributed-split-log
[ https://issues.apache.org/jira/browse/HBASE-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13286691#comment-13286691 ]

Zhihong Yu commented on HBASE-6134:
-----------------------------------

From https://builds.apache.org/job/PreCommit-HBASE-Build/2074//testReport/org.apache.hadoop.hbase.regionserver.wal/TestHLog/testAppendClose/:
{code}
java.net.BindException: Problem binding to localhost/127.0.0.1:55283 : Address already in use
{code}
[jira] [Commented] (HBASE-6134) Improvement for split-worker to speed up distributed-split-log
[ https://issues.apache.org/jira/browse/HBASE-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13285498#comment-13285498 ]

chunhui shen commented on HBASE-6134:
-------------------------------------

Review Board: https://reviews.apache.org/r/5294/
[jira] [Commented] (HBASE-6134) Improvement for split-worker to speed up distributed-split-log
[ https://issues.apache.org/jira/browse/HBASE-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13285502#comment-13285502 ]

ramkrishna.s.vasudevan commented on HBASE-6134:
-----------------------------------------------

@Chunhui
Nice work.
bq. Also, we found closing writer is quite slow
Any reason why it was slow?
[jira] [Commented] (HBASE-6134) Improvement for split-worker to speed up distributed-split-log
[ https://issues.apache.org/jira/browse/HBASE-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13285511#comment-13285511 ]

chunhui shen commented on HBASE-6134:

@ram
From the log, we found that each writer.close() takes about 30ms~50ms. So closing the writers one by one takes about 20s if there are 400 regions in one hlog file (400 x 50ms = 20s).
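The arithmetic behind the comment above is 400 writers x 30~50ms per close, which is 12~20s when the closes run one after another. The issue description proposes closing the writers in parallel threads. The following is only a minimal sketch of that idea, not the code from the attached patch: `ParallelCloseSketch`, `SlowWriter`, `closeAll`, the pool size and the sleep time are all hypothetical names and values chosen for illustration, and a plain `java.io.Closeable` stands in for the real per-region writer.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelCloseSketch {

  /** Hypothetical stand-in for a per-region writer whose close() is slow. */
  static class SlowWriter implements Closeable {
    private final long closeMillis;
    volatile boolean closed = false;

    SlowWriter(long closeMillis) {
      this.closeMillis = closeMillis;
    }

    @Override
    public void close() throws IOException {
      try {
        Thread.sleep(closeMillis); // simulates the 30ms~50ms close latency
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IOException("interrupted while closing", e);
      }
      closed = true;
    }
  }

  /**
   * Closes all writers on a fixed-size thread pool and returns how many
   * closed successfully. Every close is attempted; the first failure is
   * rethrown once all of them have finished.
   */
  static int closeAll(List<? extends Closeable> writers, int threads) throws IOException {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    List<Future<Void>> pending = new ArrayList<>();
    try {
      for (final Closeable w : writers) {
        Callable<Void> task = () -> {
          w.close();
          return null;
        };
        pending.add(pool.submit(task));
      }
      int closed = 0;
      IOException first = null;
      for (Future<Void> f : pending) {
        try {
          f.get(); // wait for this close to finish
          closed++;
        } catch (ExecutionException e) {
          if (first == null) {
            first = new IOException("close failed", e.getCause());
          }
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          throw new IOException("interrupted while waiting for close", e);
        }
      }
      if (first != null) {
        throw first;
      }
      return closed;
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) throws IOException {
    List<SlowWriter> writers = new ArrayList<>();
    for (int i = 0; i < 400; i++) {
      writers.add(new SlowWriter(5)); // shortened latency so the demo runs quickly
    }
    long start = System.nanoTime();
    int n = closeAll(writers, 16);
    long ms = (System.nanoTime() - start) / 1000 / 1000;
    System.out.println("closed " + n + " writers in " + ms + " ms");
  }
}
```

With a pool of N threads the wall-clock time for the close phase drops to roughly (number of writers / N) x per-close latency, assuming the underlying filesystem can absorb the concurrent closes. Whether that holds on a loaded cluster would need to be measured.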
[jira] [Commented] (HBASE-6134) Improvement for split-worker to speed up distributed-split-log
[ https://issues.apache.org/jira/browse/HBASE-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13285513#comment-13285513 ]

chunhui shen commented on HBASE-6134:

@ram
Could you check in your environment how much time writer.close() takes? It is reported by this log statement:
LOG.info("Closed path " + wap.p + " (wrote " + wap.editsWritten + " edits in " + (wap.nanosSpent / 1000 / 1000) + "ms)");
[jira] [Commented] (HBASE-6134) Improvement for split-worker to speed up distributed-split-log
[ https://issues.apache.org/jira/browse/HBASE-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13285515#comment-13285515 ]

ramkrishna.s.vasudevan commented on HBASE-6134:

Ok. I will check that, but maybe not today. :)
[jira] [Commented] (HBASE-6134) Improvement for split-worker to speed up distributed-split-log
[ https://issues.apache.org/jira/browse/HBASE-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13285623#comment-13285623 ]

Hadoop QA commented on HBASE-6134:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12530177/HBASE-6134.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
    org.apache.hadoop.hbase.regionserver.wal.TestWALReplayCompressed
    org.apache.hadoop.hbase.regionserver.wal.TestWALReplay

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2051//testReport/
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2051//console

This message is automatically generated.
[jira] [Commented] (HBASE-6134) Improvement for split-worker to speed up distributed-split-log
[ https://issues.apache.org/jira/browse/HBASE-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13285891#comment-13285891 ]

Zhihong Yu commented on HBASE-6134:

Test failed with NPE at:
{code}
for (FileStatus fileStatus : listStatus1) {
{code}
Checking for null wouldn't help either:
{code}
Failed tests:
  testSequentialEditLogSeqNum(org.apache.hadoop.hbase.regionserver.wal.TestWALReplay): The sequence number of the recoverd.edits and the current edit seq should be same expected:17 but was:0
{code}
[jira] [Commented] (HBASE-6134) Improvement for split-worker to speed up distributed-split-log
[ https://issues.apache.org/jira/browse/HBASE-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13286313#comment-13286313 ]

Hadoop QA commented on HBASE-6134:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12530321/HBASE-6134v2.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2070//testReport/
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2070//console

This message is automatically generated.
[jira] [Commented] (HBASE-6134) Improvement for split-worker to speed up distributed-split-log
[ https://issues.apache.org/jira/browse/HBASE-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13286320#comment-13286320 ]

stack commented on HBASE-6134:

I added some comments over on Review Board on the latest version of the patch.