[jira] [Commented] (HBASE-6497) Revisit HLog sizing and roll parameters
[ https://issues.apache.org/jira/browse/HBASE-6497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13429350#comment-13429350 ]

Jean-Daniel Cryans commented on HBASE-6497:
---

bq. Less parallelization per RS. If you have a lot of RSes, lowering file count does help reduce HBase RPCs too?

I'm not sure I understand what you mean. HBase RPCs in which context?

Revisit HLog sizing and roll parameters
---
Key: HBASE-6497
URL: https://issues.apache.org/jira/browse/HBASE-6497
Project: HBase
Issue Type: Improvement
Components: regionserver
Reporter: Lars George

The last major update to the HLog sizing and roll features was done in HBASE-1394. I am proposing to revisit these settings to overcome recent issues where the HLog becomes a major bottleneck.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6497) Revisit HLog sizing and roll parameters
[ https://issues.apache.org/jira/browse/HBASE-6497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13428544#comment-13428544 ]

Harsh J commented on HBASE-6497:

bq. Less parallelization during distributed splitting since the unit of distribution is a file.

Less parallelization per RS. If you have a lot of RSes, lowering file count does help reduce HBase RPCs too?
[jira] [Commented] (HBASE-6497) Revisit HLog sizing and roll parameters
[ https://issues.apache.org/jira/browse/HBASE-6497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13427130#comment-13427130 ]

Lars George commented on HBASE-6497:

The goal in designing a proper HBase schema is to maximize heap usage across all regions, which can lead to the situation where the WALs (aka HLogs) are required to be kept for a considerable amount of time. The last iteration on WAL properties added a configurable block size, as well as a threshold percentage to roll the log before it completely fills the single HDFS block (see HBASE-1394). I am questioning whether this is still an issue, maybe even in the light of recent improvements to log performance, for example HBASE-5699 and HBASE-4608. At the least, I would like to figure out whether we should increase the WAL size to 512 MB, to avoid getting into early flushing situations that impact the overall I/O.

Isn't HBASE-1364 helping to split larger logs (though not the logs themselves, but distributed across the region servers, obviously)? I am not sure whether log splitting prefers block-local nodes first, so that there is no remote reading, though.

Questions:
# Is there a need to keep the logs small (typically 64-128 MB depending on the HDFS config)?
# Should we go multiple blocks?
# Do we still need the logroll multiplier?
# Should we increase the maxlogs number (default is 32)?
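A rough sketch of the sizing arithmetic behind these questions, not HBase code. It assumes that a log is rolled once it reaches the block size times the roll multiplier, and that regions are force-flushed once more than maxlogs files are retained. The function names are hypothetical, and the 0.95 multiplier is my recollection of the default at the time (the corresponding settings would be hbase.regionserver.hlog.blocksize, hbase.regionserver.logroll.multiplier and hbase.regionserver.maxlogs), so treat it as an assumption.

```python
MB = 1024 * 1024


def roll_size_bytes(block_size_bytes, roll_multiplier):
    """Size at which one log file is rolled (a fraction of its block size)."""
    return int(block_size_bytes * roll_multiplier)


def max_retained_log_bytes(block_size_bytes, roll_multiplier, max_logs):
    """Approximate volume of WAL data kept before flushes are forced."""
    return roll_size_bytes(block_size_bytes, roll_multiplier) * max_logs


if __name__ == "__main__":
    # Assumed setup: 128 MB blocks, multiplier 0.95, maxlogs 32.
    current = max_retained_log_bytes(128 * MB, 0.95, 32)
    # Variant raised in the comment above: 512 MB logs, maxlogs unchanged.
    larger = max_retained_log_bytes(512 * MB, 0.95, 32)
    print("current budget (MB):", current // MB)
    print("512 MB logs budget (MB):", larger // MB)
```

With the multiplier below 1.0 each file is rolled slightly before the block fills, so the retained budget is a little smaller than block size times maxlogs.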
[jira] [Commented] (HBASE-6497) Revisit HLog sizing and roll parameters
[ https://issues.apache.org/jira/browse/HBASE-6497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13427615#comment-13427615 ]

Jean-Daniel Cryans commented on HBASE-6497:
---

bq. Is there a need to keep the logs small (typically 64-128 MB depending on the HDFS config)?

bq. If current is 128 MB x 32 = 4096 MB (4 GB) of logs approx. before full flush, then let's change that to have fewer than 32 files (reduces NN RPCs during recovery and increases the sequential read length), down to 8 maxlogs at a 512 MB default size (8 x 512 = 4096 again).

Issues with bigger files while having fewer of them:
- Less parallelization during distributed splitting, since the unit of distribution is a file.
- Fewer opportunities to get rid of logs without having to force flush regions. The worst case would be a maximum of 1 file, meaning that when you roll you need to force flush everything that hasn't been flushed yet.
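The two issues listed above can be illustrated with a toy model. This is a simplified sketch under stated assumptions, not how HBase tracks its logs: edits carry increasing sequence ids, every file holds the same number of edits, and a file can be dropped only when all of its edits are older than the oldest edit that has not been flushed yet. All names and the edit counts are hypothetical.

```python
def evenly_filled(max_logs, total_edits):
    """Highest sequence id in each log file, assuming equal-sized files."""
    per_log = total_edits // max_logs
    return [per_log * (i + 1) for i in range(max_logs)]


def removable_logs(log_max_seq_ids, oldest_unflushed_seq_id):
    """Files whose edits have all been flushed and can therefore go away."""
    return [s for s in log_max_seq_ids if s < oldest_unflushed_seq_id]


def split_tasks(max_logs):
    """The unit of distribution is a file, so at most one task per file."""
    return max_logs


if __name__ == "__main__":
    total_edits = 4000      # same data volume in every layout
    oldest_unflushed = 1500  # oldest edit still only in the memstore
    for max_logs, size_mb in [(32, 128), (8, 512), (1, 4096)]:
        logs = evenly_filled(max_logs, total_edits)
        freed = len(removable_logs(logs, oldest_unflushed))
        print(f"{max_logs} x {size_mb} MB: "
              f"{split_tasks(max_logs)} split tasks, "
              f"{freed} of {max_logs} files removable without a flush")
```

In this model the total budget is the same 4096 MB in all three layouts, but the single-file layout can never release a file until everything has been flushed, which is the worst case described above.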