[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?
[ https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13641489#comment-13641489 ]
Devaraj Das commented on HBASE-5930:
Cool. This looks like it does exactly what you stated in your last comment. +1 (not sure why releaseaudit failed, even though the structure of the patch is essentially the same as what I submitted before). I'd like to commit this tomorrow morning unless I hear otherwise from anyone.
Periodically flush the Memstore?
Key: HBASE-5930
URL: https://issues.apache.org/jira/browse/HBASE-5930
Project: HBase
Issue Type: Improvement
Reporter: Lars Hofhansl
Assignee: Devaraj Das
Priority: Minor
Fix For: 0.95.1
Attachments: 5930-1.patch, 5930-2.1.patch, 5930-2.2.patch, 5930-2.3.patch, 5930-2.4.patch, 5930-track-oldest-sample.txt, 5930-wip.patch
A colleague of mine ran into an interesting issue. He inserted some data with the WAL disabled, which happened to fit in the aggregate Memstores memory. Two weeks later he had a problem with the HDFS cluster, which caused the region servers to abort. He found that his data was lost. Looking at the log we found that the Memstores were not flushed at all during these two weeks.
Should we have an option to flush memstores periodically? There are obvious downsides to this, like many small storefiles, etc.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?
[ https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13641987#comment-13641987 ]
ramkrishna.s.vasudevan commented on HBASE-5930:
+1 on the Lars patch.
[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?
[ https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13641993#comment-13641993 ]
Devaraj Das commented on HBASE-5930:
Don't think the releaseaudit warning is caused by the last patch. It seems to be there in the other builds prior to this pre-commit build as well (is HBASE-8431 the fix?). I ran the releaseaudit test locally and it passed.
[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?
[ https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13642019#comment-13642019 ]
Lars Hofhansl commented on HBASE-5930:
Whoa. You guys are fast. This was more of a sample patch :)
I'll do a bit more double checking today.
I would also like to have this in 0.94. Any objections to that?
[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?
[ https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13642022#comment-13642022 ]
Devaraj Das commented on HBASE-5930:
Sure Lars, do the due diligence from your side. No objections for commit to 0.94.
Key: HBASE-5930
URL: https://issues.apache.org/jira/browse/HBASE-5930
Project: HBase
Issue Type: Improvement
Reporter: Lars Hofhansl
Assignee: Devaraj Das
Priority: Minor
Fix For: 0.98.0, 0.95.1
Attachments: 5930-1.patch, 5930-2.1.patch, 5930-2.2.patch, 5930-2.3.patch, 5930-2.4.patch, 5930-track-oldest-sample.txt, 5930-wip.patch
[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?
[ https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13640707#comment-13640707 ]
Devaraj Das commented on HBASE-5930:
Yes, Lars, I think the patch does what you talk about.
Key: HBASE-5930
URL: https://issues.apache.org/jira/browse/HBASE-5930
Project: HBase
Issue Type: Improvement
Reporter: Lars Hofhansl
Assignee: Devaraj Das
Priority: Minor
Fix For: 0.95.1
Attachments: 5930-1.patch, 5930-2.1.patch, 5930-2.2.patch, 5930-2.3.patch, 5930-2.4.patch, 5930-wip.patch
[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?
[ https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13640745#comment-13640745 ]
Lars Hofhansl commented on HBASE-5930:
Sorry if I seem difficult with this one...
How about a theme like this:
* We record the time of the first edit made to the memstore since the last flush. We can even improve this and only record the time of the last unlogged edit made.
* Periodically we run the chore; if the recorded time of that first edit is older than a configurable X, then we flush the memstore.
That would:
# be simpler
# clearly limit the maximum time an edit will stay in the memstore without being flushed
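The scheme in the comment above can be sketched roughly as follows. This is a minimal illustration only, not the code from any of the attached patches: `OldestEditTracker`, its method names, and the injected clock are hypothetical, and `maxAgeMillis` stands in for the "configurable X". The clock is passed as a supplier so that it is read only for the first edit after a flush.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

/** Hypothetical helper (not an HBase class): tracks the age of the oldest unflushed edit. */
final class OldestEditTracker {
    /** Sentinel meaning "no edit recorded since the last flush". */
    private static final long NO_EDITS = Long.MAX_VALUE;

    private final AtomicLong oldestEditMillis = new AtomicLong(NO_EDITS);
    private final long maxAgeMillis; // stands in for the "configurable X"

    OldestEditTracker(long maxAgeMillis) {
        this.maxAgeMillis = maxAgeMillis;
    }

    /** Called on every edit; the clock is consulted only if no time is recorded yet. */
    void recordEdit(LongSupplier clock) {
        if (oldestEditMillis.get() == NO_EDITS) {
            oldestEditMillis.compareAndSet(NO_EDITS, clock.getAsLong());
        }
    }

    /** Called from the periodic chore: true once the oldest edit is older than the limit. */
    boolean shouldFlush(long nowMillis) {
        long oldest = oldestEditMillis.get();
        return oldest != NO_EDITS && nowMillis - oldest > maxAgeMillis;
    }

    /** Called after a flush so that the next edit starts a new window. */
    void onFlush() {
        oldestEditMillis.set(NO_EDITS);
    }
}
```

In a region server the chore would presumably call `shouldFlush(System.currentTimeMillis())` for each region and request a flush when it returns true; that wiring is left out here because it depends on HBase internals not shown in this thread.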
[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?
[ https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13641298#comment-13641298 ]
Devaraj Das commented on HBASE-5930:
:-) Lars, I'd appreciate it if you would take a look at the patch. The patch as it stands is simple and also limits the maximum time an edit will stay un-flushed. The high-level outline is here: https://issues.apache.org/jira/browse/HBASE-5930?focusedCommentId=13571813&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13571813 (I have also put detailed comments in the shouldFlush method added in the patch).
[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?
[ https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13641356#comment-13641356 ]
Lars Hofhansl commented on HBASE-5930:
Why is the approach in the patch better than what I have described?
I believe the approach I described is better in the following ways:
* The logic is simpler
* We directly measure the age of the oldest edit in the memstore, which is the exact metric we want to limit
* We only have to track the current time for the first KV inserted into the memstore after a flush (System.currentTimeMillis() is not free)
I'm happy to make a sample patch, then we can decide on the merit of the two patches.
[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?
[ https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13641429#comment-13641429 ]
Hadoop QA commented on HBASE-5930:
{color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12580468/5930-track-oldest-sample.txt against trunk revision .
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests.
{color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile.
{color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:red}-1 release audit{color}. The applied patch generated 1 release audit warnings (more than the trunk's current 0 warnings).
{color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100
{color:green}+1 site{color}. The mvn site goal succeeds with this patch.
{color:green}+1 core tests{color}. The patch passed unit tests in .
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/5444//testReport/
Release audit warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5444//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5444//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5444//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5444//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5444//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5444//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5444//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5444//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5444//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/5444//console
This message is automatically generated.
[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?
[ https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13639213#comment-13639213 ]
stack commented on HBASE-5930:
[~lhofhansl] ping. Question for you in above.
Key: HBASE-5930
URL: https://issues.apache.org/jira/browse/HBASE-5930
Project: HBase
Issue Type: Improvement
Reporter: Lars Hofhansl
Assignee: Devaraj Das
Priority: Minor
Fix For: 0.95.1
Attachments: 5930-1.patch, 5930-2.1.patch, 5930-2.2.patch, 5930-2.3.patch, 5930-wip.patch
[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?
[ https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13639805#comment-13639805 ]
Hadoop QA commented on HBASE-5930:
{color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12568108/5930-2.3.patch against trunk revision .
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests.
{color:red}-1 patch{color}. The patch command could not apply the patch.
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/5418//console
This message is automatically generated.
[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?
[ https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13639961#comment-13639961 ]
Hadoop QA commented on HBASE-5930:
{color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12580196/5930-2.4.patch against trunk revision .
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests.
{color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile.
{color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100
{color:green}+1 site{color}. The mvn site goal succeeds with this patch.
{color:green}+1 core tests{color}. The patch passed unit tests in .
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/5422//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5422//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5422//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5422//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5422//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5422//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5422//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5422//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/5422//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/5422//console
This message is automatically generated.
[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?
[ https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13640019#comment-13640019 ]
Lars Hofhansl commented on HBASE-5930:
I'd have to reread the patch. The semantics that we should achieve is a maximum time for an unlogged KV to remain in the memstore (this is different from periodic flushing - I misnamed this issue).
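One way to read the distinction drawn above, as a small sketch: a plain periodic flush fires on a timer regardless of what the memstore holds, while the max-age semantics fire only when an unlogged edit exists and has been held longer than the limit. The class and method names below are hypothetical and are not taken from HBase or from the patches on this issue.

```java
import java.util.OptionalLong;

/** Hypothetical illustration (not HBase code) of the two flush semantics. */
final class FlushPolicies {
    private FlushPolicies() {}

    /** Plain periodic flushing: due whenever a full period has elapsed since the last flush. */
    static boolean periodicFlushDue(long lastFlushMillis, long nowMillis, long periodMillis) {
        return nowMillis - lastFlushMillis >= periodMillis;
    }

    /**
     * Max-age semantics: due only if there is an unlogged edit and it has been
     * in the memstore longer than maxAgeMillis. An empty value means there is
     * no unlogged edit, so nothing is at risk and no flush is needed.
     */
    static boolean maxAgeFlushDue(OptionalLong oldestUnloggedEditMillis,
                                  long nowMillis, long maxAgeMillis) {
        return oldestUnloggedEditMillis.isPresent()
            && nowMillis - oldestUnloggedEditMillis.getAsLong() > maxAgeMillis;
    }
}
```

With a one-hour setting and an idle region, `periodicFlushDue` would keep reporting true every hour, whereas `maxAgeFlushDue` stays false until an unlogged edit has actually been held for more than an hour.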
[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?
[ https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13616335#comment-13616335 ]
Nicolas Liochon commented on HBASE-5930:
This one has been forgotten. [~lhofhansl], do you have any opinion on the patch? [~devaraj], I can finish the work (if any :-) ) if you're busy.
Key: HBASE-5930
URL: https://issues.apache.org/jira/browse/HBASE-5930
Project: HBase
Issue Type: Improvement
Reporter: Lars Hofhansl
Assignee: Devaraj Das
Priority: Minor
Fix For: 0.95.0
Attachments: 5930-1.patch, 5930-2.1.patch, 5930-2.2.patch, 5930-2.3.patch, 5930-wip.patch
[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?
[ https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13616417#comment-13616417 ]
Devaraj Das commented on HBASE-5930:
Hey [~nkeywal], thanks for bringing this up. I was actually waiting for [~lhofhansl] to get back with his opinion on the latest patch...
[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?
[ https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13573705#comment-13573705 ]
Hadoop QA commented on HBASE-5930:
{color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12568108/5930-2.3.patch against trunk revision .
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests.
{color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100
{color:red}-1 core tests{color}. The patch failed these unit tests: org.apache.hadoop.hbase.security.access.TestAccessControlFilter
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/4369//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4369//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4369//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4369//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4369//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4369//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4369//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4369//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/4369//console
This message is automatically generated.
Key: HBASE-5930
URL: https://issues.apache.org/jira/browse/HBASE-5930
Project: HBase
Issue Type: Improvement
Reporter: Lars Hofhansl
Assignee: Devaraj Das
Priority: Minor
Fix For: 0.96.0
Attachments: 5930-1.patch, 5930-2.1.patch, 5930-2.2.patch, 5930-2.3.patch, 5930-wip.patch
[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?
[ https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13572632#comment-13572632 ]

Devaraj Das commented on HBASE-5930:

I should add that making the checks as described before prevents some potentially unneeded flushes, while bounding the maximum duration an edit lives in the memstore. [~lhofhansl], could you please take a look at the patch?

Periodically flush the Memstore? Key: HBASE-5930 URL: https://issues.apache.org/jira/browse/HBASE-5930 Project: HBase Issue Type: Improvement Reporter: Lars Hofhansl Assignee: Devaraj Das Priority: Minor Fix For: 0.96.0 Attachments: 5930-1.patch, 5930-2.1.patch, 5930-2.2.patch, 5930-2.3.patch, 5930-wip.patch
[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?
[ https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13571929#comment-13571929 ] Hadoop QA commented on HBASE-5930: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12568108/5930-2.3.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:red}-1 core tests{color}. 
The patch failed these unit tests: org.apache.hadoop.hbase.client.TestAdmin

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/4344//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4344//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4344//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4344//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4344//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4344//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4344//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/4344//console

This message is automatically generated.

Periodically flush the Memstore? Key: HBASE-5930 URL: https://issues.apache.org/jira/browse/HBASE-5930 Project: HBase Issue Type: Improvement Reporter: Lars Hofhansl Assignee: Devaraj Das Priority: Minor Fix For: 0.96.0 Attachments: 5930-1.patch, 5930-2.1.patch, 5930-2.2.patch, 5930-2.3.patch, 5930-wip.patch
[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?
[ https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13565003#comment-13565003 ] Hadoop QA commented on HBASE-5930: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12566881/5930-2.2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:red}-1 core tests{color}. 
The patch failed these unit tests: org.apache.hadoop.hbase.io.TestHeapSize

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/4226//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4226//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4226//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4226//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4226//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4226//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4226//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/4226//console

This message is automatically generated.

Periodically flush the Memstore? Key: HBASE-5930 URL: https://issues.apache.org/jira/browse/HBASE-5930 Project: HBase Issue Type: Improvement Reporter: Lars Hofhansl Assignee: Devaraj Das Priority: Minor Fix For: 0.96.0 Attachments: 5930-1.patch, 5930-2.1.patch, 5930-2.2.patch, 5930-wip.patch
[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?
[ https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13564097#comment-13564097 ]

Lars Hofhansl commented on HBASE-5930:

Hmm... This is a bit more difficult than I thought. I think what we want to limit is this: the maximum time an unflushed edit will remain in the memstore. Otherwise one could trickle in one edit every hour and get very old data in the memstore. (Doing that could potentially also be cheaper, as we do not need to retrieve the current time on each edit, just on the first one after a flush.)

If that is true, then what we want to track is not the time of the newest edit but the time of the oldest unflushed edit, and flush if that gets too old.

In order to avoid flushing all memstores at the same time, we want to offset the memstores' flush times. We can do it the way you have it. (But it seems natural to me to do that at the place where we detect that the memstore needs to be flushed. For this to work the chore needs to wake up more frequently than the flush interval.)

Btw, the flush interval you have is 10 minutes, not 1h.

Periodically flush the Memstore? Key: HBASE-5930 URL: https://issues.apache.org/jira/browse/HBASE-5930 Project: HBase Issue Type: Improvement Reporter: Lars Hofhansl Assignee: Devaraj Das Priority: Minor Fix For: 0.96.0 Attachments: 5930-1.patch, 5930-2.1.patch, 5930-wip.patch
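To make the "oldest unflushed edit" idea concrete, here is a minimal sketch. It is not code from any of the attached patches: `OldestEditTracker`, its methods, and the injected clock are all hypothetical names chosen for illustration. It reads the clock only for the first edit after a flush and reports a flush as due once that edit exceeds a maximum age.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

// Hypothetical helper (not an HBase class): remembers when the oldest
// unflushed edit arrived so that a periodic check can bound its age.
final class OldestEditTracker {
    private static final long NONE = Long.MAX_VALUE; // sentinel: nothing unflushed
    private final AtomicLong oldestEditMillis = new AtomicLong(NONE);
    private final long maxAgeMillis;
    private final LongSupplier clock; // injected so the sketch can be tested

    OldestEditTracker(long maxAgeMillis, LongSupplier clock) {
        this.maxAgeMillis = maxAgeMillis;
        this.clock = clock;
    }

    // Called on every edit. The clock is read only when no timestamp is
    // recorded yet, i.e. for the first edit after a flush.
    void onEdit() {
        if (oldestEditMillis.get() == NONE) {
            oldestEditMillis.compareAndSet(NONE, clock.getAsLong());
        }
    }

    // Called once a flush has completed. A real implementation would have to
    // coordinate this with edits that arrive while the flush is running.
    void onFlush() {
        oldestEditMillis.set(NONE);
    }

    // True when the oldest unflushed edit is older than the allowed maximum.
    boolean shouldFlush() {
        long oldest = oldestEditMillis.get();
        return oldest != NONE && clock.getAsLong() - oldest > maxAgeMillis;
    }
}
```

A chore would call `shouldFlush()` on a period shorter than `maxAgeMillis`; otherwise an edit could outlive the bound by up to one chore period.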
[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?
[ https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13562933#comment-13562933 ] Hadoop QA commented on HBASE-5930: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12566533/5930-2.1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:red}-1 core tests{color}. 
The patch failed these unit tests: org.apache.hadoop.hbase.io.TestHeapSize

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/4182//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4182//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4182//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4182//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4182//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4182//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/4182//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/4182//console

This message is automatically generated.

Periodically flush the Memstore? Key: HBASE-5930 URL: https://issues.apache.org/jira/browse/HBASE-5930 Project: HBase Issue Type: Improvement Reporter: Lars Hofhansl Assignee: Devaraj Das Priority: Minor Fix For: 0.96.0 Attachments: 5930-1.patch, 5930-2.1.patch, 5930-wip.patch
[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?
[ https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13563163#comment-13563163 ]

Ted Yu commented on HBASE-5930:

{code}
+ private Random rand = new Random();
{code}
Please use SecureRandom.

{code}
+ } catch (InterruptedException ie){
+ //ignore
{code}
Please restore interrupt status.

Should the upper bound for the sleep take the length of MemStoreFlusher.flushQueue into consideration? When many FlushQueueEntry's pile up in flushQueue, we may want to wait longer.

Also, the sleep should be bounded by the remaining time w.r.t. cacheFlushInterval - we don't want the loop in chore() to outlast cacheFlushInterval.

Periodically flush the Memstore? Key: HBASE-5930 URL: https://issues.apache.org/jira/browse/HBASE-5930 Project: HBase Issue Type: Improvement Reporter: Lars Hofhansl Assignee: Devaraj Das Priority: Minor Fix For: 0.96.0 Attachments: 5930-1.patch, 5930-2.1.patch, 5930-wip.patch
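For readers unfamiliar with the "restore interrupt status" request, the usual Java idiom looks roughly like the following. This is a generic sketch, not the patch's code; `InterruptAwareSleep` and `sleepQuietly` are hypothetical names.

```java
// Hypothetical helper illustrating the standard idiom: when a sleep is
// interrupted, set the thread's interrupt flag again instead of swallowing it.
final class InterruptAwareSleep {
    // Sleeps for up to millis. Returns true if the full sleep completed,
    // false if the thread was interrupted (the flag is then set again so
    // that callers further up the stack can still observe it).
    static boolean sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
            return true;
        } catch (InterruptedException ie) {
            // Thread.sleep clears the flag when it throws; put it back.
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

With the flag restored, a loop in the chore can check `Thread.currentThread().isInterrupted()` and stop early during shutdown.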
[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?
[ https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13563193#comment-13563193 ]

Lars Hofhansl commented on HBASE-5930:

Absolutely do not use SecureRandom here. We're not using this to generate cryptographic keys, but just some jitter for memstore flush timing, right? SecureRandom will exhaust your locally generated entropy, which is much better used in cases where it is actually needed (and it can hang - on Linux at least - if not enough entropy has been collected).

Periodically flush the Memstore? Key: HBASE-5930 URL: https://issues.apache.org/jira/browse/HBASE-5930 Project: HBase Issue Type: Improvement Reporter: Lars Hofhansl Assignee: Devaraj Das Priority: Minor Fix For: 0.96.0 Attachments: 5930-1.patch, 5930-2.1.patch, 5930-wip.patch
[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?
[ https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13563217#comment-13563217 ]

Lars Hofhansl commented on HBASE-5930:

I think the delay should be algorithmically related to the flush interval (like interval / 3 or something; we could make the jitter factor configurable). Could we fold the jitter into shouldFlush() rather than actually waiting in chore()?

Periodically flush the Memstore? Key: HBASE-5930 URL: https://issues.apache.org/jira/browse/HBASE-5930 Project: HBase Issue Type: Improvement Reporter: Lars Hofhansl Assignee: Devaraj Das Priority: Minor Fix For: 0.96.0 Attachments: 5930-1.patch, 5930-2.1.patch, 5930-wip.patch
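One possible reading of "fold the jitter into shouldFlush()" is sketched below. Everything here is an assumption made for illustration: `JitteredFlushPolicy`, the `jitterFactor` parameter, and the choice to add the jitter to the interval rather than subtract it are not taken from the patch. It uses plain `java.util.Random`, since the value only spreads out flush times.

```java
import java.util.Random;

// Hypothetical policy object: the flush threshold is the configured interval
// plus a random offset of at most interval * jitterFactor, so memstores that
// were written at the same moment do not all become due at the same moment.
final class JitteredFlushPolicy {
    private final long flushIntervalMillis;
    private final double jitterFactor; // e.g. 1.0 / 3, assumed configurable
    private final Random rand;         // java.util.Random is enough for jitter
    private long jitterMillis;

    JitteredFlushPolicy(long flushIntervalMillis, double jitterFactor, Random rand) {
        this.flushIntervalMillis = flushIntervalMillis;
        this.jitterFactor = jitterFactor;
        this.rand = rand;
        redrawJitter();
    }

    // Draw a fresh offset, for example after each flush.
    void redrawJitter() {
        long maxJitter = (long) (flushIntervalMillis * jitterFactor);
        jitterMillis = maxJitter > 0 ? (long) (rand.nextDouble() * maxJitter) : 0L;
    }

    long currentJitterMillis() {
        return jitterMillis;
    }

    // No sleeping here: the check simply compares the age of the oldest
    // unflushed edit against the jittered threshold.
    boolean shouldFlush(long oldestEditMillis, long nowMillis) {
        return nowMillis - oldestEditMillis > flushIntervalMillis + jitterMillis;
    }
}
```

Adding the offset means an edit can live up to roughly interval * (1 + jitterFactor); subtracting it instead would keep the configured interval as a hard upper bound at the cost of somewhat earlier flushes.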
[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?
[ https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13563225#comment-13563225 ]

Devaraj Das commented on HBASE-5930:

bq. I think the delay should be algorithmically related to the flush interval

I think what I currently have has a certain advantage: if the configured value of cacheflushinterval is too low, the chore will be triggered very often, but the sleep interval (0 - 2 minutes) would keep the number of flushes under control. But yeah, I can always enforce a minimum delay before each flush.

bq. Could we fold the jitter into shouldFlush() rather than actually waiting in chore()?

I think that shouldFlush shouldn't be involved in determining how much to delay the flush. Do you see any issues in waiting in the chore?

Periodically flush the Memstore? Key: HBASE-5930 URL: https://issues.apache.org/jira/browse/HBASE-5930 Project: HBase Issue Type: Improvement Reporter: Lars Hofhansl Assignee: Devaraj Das Priority: Minor Fix For: 0.96.0 Attachments: 5930-1.patch, 5930-2.1.patch, 5930-wip.patch
[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?
[ https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13563261#comment-13563261 ]

Devaraj Das commented on HBASE-5930:

bq. Should upper bound for the sleep take length of MemStoreFlusher.flushQueue into consideration ?

[~yuzhih...@gmail.com], I think we don't have to worry about this one as much. The reason is that there is a random delay before each flush is inserted in the queue (as opposed to inserts coming in at a rate faster than what the flusher can handle).

bq. Also, the sleep should be bounded by the remaining time w.r.t. cacheFlushInterval - we don't want the loop in chore() to outlast cacheFlushInterval.

This should be fine. I don't see issues with this one.

Periodically flush the Memstore? Key: HBASE-5930 URL: https://issues.apache.org/jira/browse/HBASE-5930 Project: HBase Issue Type: Improvement Reporter: Lars Hofhansl Assignee: Devaraj Das Priority: Minor Fix For: 0.96.0 Attachments: 5930-1.patch, 5930-2.1.patch, 5930-wip.patch
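The bounded sleep agreed on above could be computed along these lines. This is only a sketch under assumptions: `BoundedDelay`, `nextDelayMillis`, and all parameter names are hypothetical, and the 0 - 2 minute cap mentioned in the thread is passed in as `maxDelayMillis` rather than hard-coded.

```java
import java.util.Random;

// Hypothetical helper: picks a random delay before queueing a flush, capped
// both by a fixed maximum and by the time left in the current chore period.
final class BoundedDelay {
    static long nextDelayMillis(Random rand, long maxDelayMillis,
                                long choreStartMillis, long nowMillis,
                                long intervalMillis) {
        // Time remaining before the next chore run would be due.
        long remaining = intervalMillis - (nowMillis - choreStartMillis);
        long bound = Math.min(maxDelayMillis, remaining);
        if (bound <= 0) {
            return 0L; // no time left in this period: do not sleep at all
        }
        return (long) (rand.nextDouble() * bound);
    }
}
```

Because each delay is recomputed against the time already spent, the sum of sleeps within one chore run cannot push the loop past the interval.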
[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?
[ https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13561495#comment-13561495 ]

nkeywal commented on HBASE-5930:

bq. With this, maybe we will no longer need skipWAL if we can prove that deferred flush is as fast as skip WAL.

In a standard database, skipping the WAL is often used when you're doing a functional upgrade requiring some unavailability time, i.e.:
- dump
- run batch scripts to update your data
- if anything goes wrong, reload the dump

For hundreds of reasons it makes much less sense with HBase, but it could happen (some companies don't need 24x7 availability). So we should not remove skipWAL imho, unless it really simplifies something internally.

On the patch itself, I have a question about adding some randomness. The scenario I'm thinking about is a massive but periodic update on a table: all the regions will be written simultaneously, hence flushed simultaneously. That's the main use case for this JIRA, and this could hammer the namenode, imho. Unless we think there is enough randomness from having a different flusher per regionserver (which may not be the case if all region servers are started simultaneously).

As a side note, I would personally like a flush interval of 10 minutes:
- it would help on .META. recovery, especially with the separate WAL for .META.
- it allows us to have more regions: today, on average and in theory, each region takes 50% of an HDFS block size of memory. The more regions we flush early, the more empty memstores we have...

Periodically flush the Memstore? Key: HBASE-5930 URL: https://issues.apache.org/jira/browse/HBASE-5930 Project: HBase Issue Type: Improvement Reporter: Lars Hofhansl Assignee: Devaraj Das Priority: Minor Fix For: 0.96.0 Attachments: 5930-1.patch, 5930-wip.patch
[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?
[ https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13561739#comment-13561739 ]

Lars Hofhansl commented on HBASE-5930:

How can deferred log flush ever be as fast as not writing the WAL at all? Considering only the latency of a single request that might be true in many cases, but it will definitely not be true on a busy cluster, since all data is written to the disks twice.

Periodically flush the Memstore? Key: HBASE-5930 URL: https://issues.apache.org/jira/browse/HBASE-5930 Project: HBase Issue Type: Improvement Reporter: Lars Hofhansl Assignee: Devaraj Das Priority: Minor Fix For: 0.96.0 Attachments: 5930-1.patch, 5930-wip.patch
[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?
[ https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13561810#comment-13561810 ]

Ted Yu commented on HBASE-5930:

Where is PeriodicMemstoreFlusher instantiated?

Currently MEMSTORE_PERIODIC_FLUSH_INTERVAL is read by both HRegion and PeriodicMemstoreFlusher.
{code}
+ boolean shouldFlush() {
{code}
Can we pass the interval to the above method so that HRegion doesn't need to introduce:
{code}
+ private long flushCheckInterval;
{code}
What value for MEMSTORE_PERIODIC_FLUSH_INTERVAL would be interpreted as disabling the periodic flush?

Thanks

Periodically flush the Memstore? Key: HBASE-5930 URL: https://issues.apache.org/jira/browse/HBASE-5930 Project: HBase Issue Type: Improvement Reporter: Lars Hofhansl Assignee: Devaraj Das Priority: Minor Fix For: 0.96.0 Attachments: 5930-1.patch, 5930-wip.patch
[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?
[ https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13561837#comment-13561837 ]

Devaraj Das commented on HBASE-5930:

[~lhofhansl], yeah, what [~enis] meant IMHO is that the latency from the client's point of view would improve when deferred flush is used for the mutations. Also, we considered the case that users would most likely not want to skip the WAL if we promise them that there wouldn't be latency issues (maybe on a busy cluster). But yeah, it would not make a difference to the overall IOPS in the cluster...

[~nkeywal], I generally agree with you that we should not remove the skipWal option without giving it real thought and considering more use cases. And yes, the idea of randomizing the flushes across regionservers sounds good. I'll think about how to incorporate that.

[~yuzhih...@gmail.com], good catch on the instantiation :) I was focusing on getting the logic right and forgot to instantiate the chore. I'd prefer to leave the shouldFlush() signature as is. It's a matter of implementation that shouldFlush currently uses the same constant underneath; it could very well use a different constant, or the shouldFlush implementation could change at some point so that this constant is not used at all.

Periodically flush the Memstore? Key: HBASE-5930 URL: https://issues.apache.org/jira/browse/HBASE-5930 Project: HBase Issue Type: Improvement Reporter: Lars Hofhansl Assignee: Devaraj Das Priority: Minor Fix For: 0.96.0 Attachments: 5930-1.patch, 5930-wip.patch
[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?
[ https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13562192#comment-13562192 ]

Ted Yu commented on HBASE-5930:

W.r.t. randomizing the flushes across regionservers, one approach is to introduce a new znode whose data is the cluster-wide count of outstanding flush requests. We place an upper bound on this count. PeriodicMemstoreFlusher wouldn't create a new request if the count is at the upper bound.

Periodically flush the Memstore? Key: HBASE-5930 URL: https://issues.apache.org/jira/browse/HBASE-5930 Project: HBase Issue Type: Improvement Reporter: Lars Hofhansl Assignee: Devaraj Das Priority: Minor Fix For: 0.96.0 Attachments: 5930-1.patch, 5930-wip.patch
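The counting logic behind that proposal can be illustrated with a purely in-process stand-in. To be clear about the assumptions: this sketch does not touch ZooKeeper at all, an `AtomicInteger` plays the role of the znode's data, and `FlushRequestLimiter` with its methods is a hypothetical name. A cluster-wide version would need a conditional update on the znode in place of the compare-and-set.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical in-process stand-in for a shared "outstanding flush requests"
// counter with an upper bound. Not a ZooKeeper client.
final class FlushRequestLimiter {
    private final int maxOutstanding;
    private final AtomicInteger outstanding = new AtomicInteger();

    FlushRequestLimiter(int maxOutstanding) {
        this.maxOutstanding = maxOutstanding;
    }

    // Try to register a new flush request; refuse when the bound is reached.
    boolean tryAcquire() {
        while (true) {
            int current = outstanding.get();
            if (current >= maxOutstanding) {
                return false;
            }
            if (outstanding.compareAndSet(current, current + 1)) {
                return true;
            }
            // Another thread changed the count in between; retry.
        }
    }

    // Call when a previously registered flush has finished.
    void release() {
        outstanding.decrementAndGet();
    }

    int outstandingCount() {
        return outstanding.get();
    }
}
```

A periodic flusher would call `tryAcquire()` before creating a request and skip the region for this round if it returns false.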
[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?
[ https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13562213#comment-13562213 ]

Lars Hofhansl commented on HBASE-5930:

That would work (using a znode). I do think it's fine to place an upper limit per regionserver, and maybe we won't need an upper limit at all.

I do like the idea of some randomness. We could stagger per memstore and add a random jitter that could be up to 1/2 (just making this up, though) of the flush interval. We'd get a new random jitter value after each flush and at memstore creation.
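A minimal sketch of the per-memstore jitter described in Lars Hofhansl's comment, assuming the jitter is added to the flush interval and is drawn uniformly from zero to half the interval. The class JitteredFlushSchedule and all of its members are hypothetical names for illustration, and the 1/2 factor is the made-up value from the comment, not a tuned one.

```java
import java.util.Random;

/**
 * Hypothetical sketch, not HBase code: a per-memstore flush deadline of
 * flushInterval plus a random jitter of at most flushInterval / 2.
 */
final class JitteredFlushSchedule {
    private final long flushIntervalMs;
    private final Random random;
    private long nextFlushAtMs;

    JitteredFlushSchedule(long flushIntervalMs, Random random, long nowMs) {
        if (flushIntervalMs <= 0) {
            throw new IllegalArgumentException("flushIntervalMs must be positive");
        }
        this.flushIntervalMs = flushIntervalMs;
        this.random = random;
        reset(nowMs); // first jitter is drawn at memstore creation
    }

    /** Draws a fresh jitter in [0, flushIntervalMs / 2]; call after each flush. */
    void reset(long nowMs) {
        long maxJitterMs = flushIntervalMs / 2;
        long jitterMs = (long) (random.nextDouble() * (maxJitterMs + 1));
        nextFlushAtMs = nowMs + flushIntervalMs + jitterMs;
    }

    boolean shouldFlush(long nowMs) {
        return nowMs >= nextFlushAtMs;
    }

    long nextFlushAtMs() {
        return nextFlushAtMs;
    }
}
```

Because every memstore draws its own jitter, memstores created at the same moment would not all come due at the same moment, which is the staggering effect the comment is after.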
[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?
[ https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13561302#comment-13561302 ]

Enis Soztutar commented on HBASE-5930:

bq. I would like to pick this up again and add a flag to Mutation to indicate deferred WAL sync. If HRegion receives a batch of Mutation of which at least one is not marked as deferred the log is sync'ed. Otherwise it is deferred.

I like the idea of having a deferred flush at the Put level. The weird thing right now is that it is per table, not per column family. I guess we can have a per-table/per-cf or per-batch deferred flush setting. With this, maybe we will no longer need skipWAL, if we can prove that deferred flush is as fast as skipping the WAL. Most of the time we do not actually want to skip the WAL, we just want a deferred flush.

bq. I decided to separate the issue of having the feature on asynchronous write to WAL from the periodic flush

+1 on separating the two.
[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?
[ https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13561358#comment-13561358 ]

Lars Hofhansl commented on HBASE-5930:

I'd be fine with this in 0.94 as well.
[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?
[ https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13560237#comment-13560237 ]

Enis Soztutar commented on HBASE-5930:

Regardless of whether mutations have deferred sync, we might always want to flush periodically. We are rolling the WAL periodically, but if we do not flush, we may end up with a lot of hlogs to recover from in case of failover.
[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?
[ https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13560253#comment-13560253 ]

Devaraj Das commented on HBASE-5930:

Yes, [~enis], that's my plan..
[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?
[ https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13499603#comment-13499603 ]

Lars Hofhansl commented on HBASE-5930:

I would like to pick this up again and add a flag to Mutation to indicate deferred WAL sync. If HRegion receives a batch of Mutations of which at least one is not marked as deferred, the log is sync'ed. Otherwise the sync is deferred. This will mingle well later with HBASE-5954.
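The batch rule in Lars Hofhansl's comment (sync the log if at least one mutation in the batch is not marked deferred) can be sketched as follows. FlaggedMutation is a hypothetical stand-in for a Mutation carrying the proposed flag; it is not the HBase Mutation class, and the treatment of an empty batch is an assumption.

```java
import java.util.List;

/** Hypothetical sketch, not HBase code: decides whether a batch forces a WAL sync. */
final class WalSyncPolicy {

    /** Stand-in for a mutation carrying the proposed deferred-sync flag. */
    static final class FlaggedMutation {
        final boolean deferredSync;

        FlaggedMutation(boolean deferredSync) {
            this.deferredSync = deferredSync;
        }
    }

    /**
     * Returns true if any mutation in the batch is not marked deferred.
     * An empty batch returns false (assumption: nothing to sync).
     */
    static boolean mustSyncNow(List<FlaggedMutation> batch) {
        for (FlaggedMutation m : batch) {
            if (!m.deferredSync) {
                return true;
            }
        }
        return false;
    }

    private WalSyncPolicy() {
    }
}
```

The rule errs on the side of durability: a single non-deferred mutation makes the whole batch pay for the sync, so a client that wants the latency benefit has to mark every mutation in the batch.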
[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?
[ https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13410457#comment-13410457 ]

nkeywal commented on HBASE-5930:

I also think that a periodic memstore flush makes sense. Even with a WAL, it seems safer/more efficient. It seems that HBase had this a long, long time ago:

{noformat}
<property>
  <name>hbase.regionserver.optionalcacheflushinterval</name>
  <value>1800000</value>
  <description>
    Amount of time to wait since the last time a region was flushed before
    invoking an optional cache flush (An optional cache flush is a flush even
    though memcache is not at the memcache.flush.size).
    Default: 30 minutes (in milliseconds)
  </description>
</property>
{noformat}

It could also be linked to major compactions (before a major compaction, flush 'old' memstores)?
[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?
[ https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13268995#comment-13268995 ]

Andrew Purtell commented on HBASE-5930:

+1

We basically do the same thing as proposed but on the client side with a shared DAO layer.
[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?
[ https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269025#comment-13269025 ]

Matt Corgan commented on HBASE-5930:

Periodically flushing the memstore seems like a good feature to me. Could also help clear out cold data from memory to make more room for bigger memstores on regions that are actually being used.

A different solution to the underlying data loss issue might be to have a third client setting for WAL writing: NONE, SYNC, and ASYNC. ASYNC would write the data to a memory buffer, return success to the client, and another thread would flush the buffer to the WAL. The WAL would ideally only lag a few seconds behind the memstores, but some form of throttling would probably be needed.
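The three-way setting in Matt Corgan's comment might look roughly like the sketch below. Everything here is hypothetical: AsyncWalSketch, WalMode, and the in-memory list standing in for the WAL are illustrative names only. The background thread is reduced to an explicit drainPending() call so the behavior stays deterministic, and the throttling the comment mentions is left out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Hypothetical sketch, not HBase code: NONE / SYNC / ASYNC handling of edits. */
final class AsyncWalSketch {
    enum WalMode { NONE, SYNC, ASYNC }

    private final List<String> durableLog = new ArrayList<>(); // stands in for the WAL
    private final Queue<String> pending = new ArrayDeque<>();  // buffer for ASYNC edits

    /** Returns once the edit would be acknowledged to the client. */
    synchronized void append(String edit, WalMode mode) {
        switch (mode) {
            case NONE:
                break; // the edit lives only in the memstore
            case SYNC:
                drainPending(); // keep log order: buffered edits go first
                durableLog.add(edit);
                break;
            case ASYNC:
                pending.add(edit); // acknowledged before it reaches the log
                break;
        }
    }

    /** What the background thread would do periodically; returns edits written. */
    synchronized int drainPending() {
        int written = 0;
        String edit;
        while ((edit = pending.poll()) != null) {
            durableLog.add(edit);
            written++;
        }
        return written;
    }

    synchronized int pendingCount() {
        return pending.size();
    }

    synchronized List<String> durableEdits() {
        return new ArrayList<>(durableLog);
    }
}
```

The window of possible loss under ASYNC is whatever sits in the pending buffer at crash time, which is why the comment wants the log to lag the memstores by only a few seconds.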
[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?
[ https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269070#comment-13269070 ]

stack commented on HBASE-5930:

Is our deferred flush == ASYNC described above?
[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?
[ https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269074#comment-13269074 ]

Lars Hofhansl commented on HBASE-5930:

That (deferred flush) is what I told my colleague to use last week. It would be nice if the client could control this (in addition to writeToWal, we could have writeToWalAsynchronously, or something).

A periodic memstore flush still makes sense. If I get some time next week I'll come up with a patch (unless somebody else wants to take this :) ).
[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?
[ https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269116#comment-13269116 ]

stack commented on HBASE-5930:

I like the idea of the client saying whether to put it on the deferred flush queue or whether it's to be flushed immediately.
[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?
[ https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13267754#comment-13267754 ]

Todd Lipcon commented on HBASE-5930:

Seems reasonable to flush the memstore if it's had no write activity at all in N minutes. Then it shouldn't lead to smaller storefiles, right?
[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?
[ https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13267765#comment-13267765 ]

Lars Hofhansl commented on HBASE-5930:

What should trigger the flush is an interesting discussion in itself. Should we flush:
* after N time units of write inactivity, or
* when the last flush happened more than N time units ago

The former would avoid smaller storefiles, the latter would put a limit on how stale an entry in the memstore can be.
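The two candidate triggers in Lars Hofhansl's comment differ only in which timestamp they compare against. A minimal sketch, assuming millisecond timestamps; the class FlushTriggers and its method names are hypothetical.

```java
/** Hypothetical sketch, not HBase code: the two flush triggers under discussion. */
final class FlushTriggers {

    /** Fires after at least nMs of write inactivity. */
    static boolean idleTrigger(long nowMs, long lastWriteMs, long nMs) {
        return nowMs - lastWriteMs >= nMs;
    }

    /** Fires when the last flush happened more than nMs ago. */
    static boolean ageTrigger(long nowMs, long lastFlushMs, long nMs) {
        return nowMs - lastFlushMs > nMs;
    }

    private FlushTriggers() {
    }
}
```

For a memstore that receives a trickle of writes, the idle trigger may never fire, because each write pushes lastWriteMs forward, whereas the age trigger fires regardless of write activity. That is the trade-off in the comment: the first avoids small storefiles, the second bounds how old an unflushed entry can get.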
[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?
[ https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13267812#comment-13267812 ]

Matt Corgan commented on HBASE-5930:

Maybe add a boolean to the memstore to track if it contains edits that were not written to the WAL. No need to auto-flush in the frequent case where all edits are in the WAL.
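Matt Corgan's boolean could be sketched as a small tracker like the one below. MemstoreWalTracker and its methods are hypothetical names for illustration, not part of HBase, and the sketch is not thread-safe.

```java
/**
 * Hypothetical sketch, not HBase code: remembers whether the memstore
 * holds any edit that skipped the WAL since the last flush.
 */
final class MemstoreWalTracker {
    private boolean hasUnloggedEdits;

    /** Record an edit; writtenToWal is false when the client skipped the WAL. */
    void recordEdit(boolean writtenToWal) {
        if (!writtenToWal) {
            hasUnloggedEdits = true;
        }
    }

    /** A flush persists everything, so the flag is cleared. */
    void onFlush() {
        hasUnloggedEdits = false;
    }

    /** The periodic flusher would skip memstores where this is false. */
    boolean needsPeriodicFlush() {
        return hasUnloggedEdits;
    }
}
```

With this flag, the periodic flush only costs a small storefile for memstores that actually carry data at risk.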
[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?
[ https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13267826#comment-13267826 ]

Jean-Daniel Cryans commented on HBASE-5930:

bq. No need to auto-flush in the frequent case where all edits are in the WAL.

And we already roll every hour. From LogRoller:

bq. this.rollperiod = this.server.getConfiguration().getLong("hbase.regionserver.logroll.period", 3600000);

Meaning that your data in the WAL can only be sitting there for so long.
[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?
[ https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13267831#comment-13267831 ]

Todd Lipcon commented on HBASE-5930:

bq. Maybe add a boolean to the memstore to track if it contains edits that were not written to the WAL

HBASE-5886 adds code which tracks how much un-WAL-ed data is in the memstore.

bq. Meaning that your data in the WAL can only be sitting there for so long.

But if we retain 20 or so HLogs, and we roll only every hour, then we still have 20 hours' worth of data sitting there unflushed, which might be a little strange if the cluster is entirely idle.
[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?
[ https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13268054#comment-13268054 ]

binlijin commented on HBASE-5930:

This feature looks good.