[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?

2013-04-25 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13641489#comment-13641489
 ] 

Devaraj Das commented on HBASE-5930:


Cool. This looks like it does exactly what you stated in your last comment. +1 
(not sure why releaseaudit failed, even though the structure of the patch is 
essentially the same as what I submitted before). I'd like to commit this 
tomorrow morning unless I hear otherwise from anyone.

 Periodically flush the Memstore?
 

 Key: HBASE-5930
 URL: https://issues.apache.org/jira/browse/HBASE-5930
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl
Assignee: Devaraj Das
Priority: Minor
 Fix For: 0.95.1

 Attachments: 5930-1.patch, 5930-2.1.patch, 5930-2.2.patch, 
 5930-2.3.patch, 5930-2.4.patch, 5930-track-oldest-sample.txt, 5930-wip.patch


 A colleague of mine ran into an interesting issue.
 He inserted some data with the WAL disabled, which happened to fit in the 
 aggregate Memstores memory.
 Two weeks later he had a problem with the HDFS cluster, which caused the 
 region servers to abort. He found that his data was lost. Looking at the log 
 we found that the Memstores were not flushed at all during these two weeks.
 Should we have an option to flush memstores periodically? There are obvious 
 downsides to this, like many small storefiles, etc.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?

2013-04-25 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13641987#comment-13641987
 ] 

ramkrishna.s.vasudevan commented on HBASE-5930:
---

+1 on Lars's patch. 



[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?

2013-04-25 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13641993#comment-13641993
 ] 

Devaraj Das commented on HBASE-5930:


Don't think the releaseaudit warning is caused by the last patch. It seems to 
be there in the other builds prior to this pre-commit build as well (is 
HBASE-8431 the fix?). I ran the releaseaudit test locally and it passed.



[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?

2013-04-25 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13642019#comment-13642019
 ] 

Lars Hofhansl commented on HBASE-5930:
--

Whoa. You guys are fast. This was more of a sample patch :)
I'll do a bit more double checking today.

I would also like to have this in 0.94. Any objections to that?



[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?

2013-04-25 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13642022#comment-13642022
 ] 

Devaraj Das commented on HBASE-5930:


Sure Lars, do the due diligence from your side. No objections to committing to 
0.94.

 Periodically flush the Memstore?
 

 Key: HBASE-5930
 URL: https://issues.apache.org/jira/browse/HBASE-5930
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl
Assignee: Devaraj Das
Priority: Minor
 Fix For: 0.98.0, 0.95.1

 Attachments: 5930-1.patch, 5930-2.1.patch, 5930-2.2.patch, 
 5930-2.3.patch, 5930-2.4.patch, 5930-track-oldest-sample.txt, 5930-wip.patch




[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?

2013-04-24 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13640707#comment-13640707
 ] 

Devaraj Das commented on HBASE-5930:


Yes, Lars, I think the patch does what you talk about.

 Periodically flush the Memstore?
 

 Key: HBASE-5930
 URL: https://issues.apache.org/jira/browse/HBASE-5930
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl
Assignee: Devaraj Das
Priority: Minor
 Fix For: 0.95.1

 Attachments: 5930-1.patch, 5930-2.1.patch, 5930-2.2.patch, 
 5930-2.3.patch, 5930-2.4.patch, 5930-wip.patch




[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?

2013-04-24 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13640745#comment-13640745
 ] 

Lars Hofhansl commented on HBASE-5930:
--

Sorry if I seem difficult with this one... How about a theme like this:
* We record the time of the first edit made to the memstore since the last 
flush. We can even improve this and only record the time of the last unlogged 
edit made.
* Periodically we run the chore; if the recorded time of that first edit is 
older than a configurable X, then we flush the memstore.

That would:
# be simpler
# clearly limit the maximum time an edit will stay in the memstore without being 
flushed
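
A rough sketch of the chore side of this idea, for illustration only. This is not code from any of the attached patches: the class name `FlushAgeChore`, the method `regionsToFlush`, the `NO_EDITS` sentinel and the region names are all hypothetical, and the map stands in for whatever per-memstore state a real region server would keep. It only shows the check "first edit older than X means flush".

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration; not an HBase class. */
final class FlushAgeChore {
    /** Sentinel meaning "no edits since the last flush" (an assumption of this sketch). */
    static final long NO_EDITS = Long.MAX_VALUE;

    /**
     * One pass of the periodic chore: returns the memstores whose first
     * unflushed edit is older than maxAgeMillis (the configurable X).
     */
    static List<String> regionsToFlush(Map<String, Long> firstEditTimeByRegion,
                                       long now, long maxAgeMillis) {
        List<String> due = new ArrayList<>();
        for (Map.Entry<String, Long> e : firstEditTimeByRegion.entrySet()) {
            long firstEdit = e.getValue();
            // Empty memstores are skipped; only the age of the first edit matters.
            if (firstEdit != NO_EDITS && now - firstEdit > maxAgeMillis) {
                due.add(e.getKey());
            }
        }
        return due;
    }

    public static void main(String[] args) {
        Map<String, Long> firstEdit = new LinkedHashMap<>();
        firstEdit.put("region-a", 1_000L);   // first edit long ago
        firstEdit.put("region-b", 9_000L);   // first edit recent
        firstEdit.put("region-c", NO_EDITS); // nothing to flush
        System.out.println(regionsToFlush(firstEdit, 10_000L, 3_600L));
    }
}
```

Because the decision depends only on the recorded first-edit time, an edit can stay unflushed for at most X plus one chore period, which is the bound the second point above refers to.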




[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?

2013-04-24 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13641298#comment-13641298
 ] 

Devaraj Das commented on HBASE-5930:


:-)

Lars, I'd appreciate it if you would take a look at the patch. The patch as it 
stands is simple and also limits the maximum time an edit will stay un-flushed. 
The high level outline is here 
https://issues.apache.org/jira/browse/HBASE-5930?focusedCommentId=13571813&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13571813
 (I have also put detailed comments in the shouldFlush method added in the 
patch)



[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?

2013-04-24 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13641356#comment-13641356
 ] 

Lars Hofhansl commented on HBASE-5930:
--

Why is the approach in the patch better than what I have described?

I believe the approach I described is better in the following ways:
* The logic is simpler
* We directly measure the age of the oldest edit in the memstore, which is the 
exact metric we want to limit
* We only have to track the current time for the first KV inserted into the 
memstore after a flush (System.currentTimeMillis() is not free)
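
To make the last point concrete, here is a minimal write-path sketch under stated assumptions. The class `OldestEditStamp` and its methods are hypothetical names, not taken from HBase or from the sample patch, and the clock is injected only so the number of reads can be counted; a real caller would presumably pass `System::currentTimeMillis`. A real memstore would also have to coordinate the reset with the flush snapshot, which this sketch ignores.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

/** Hypothetical illustration; not an HBase class. */
final class OldestEditStamp {
    /** Sentinel meaning "memstore is empty since the last flush". */
    static final long NONE = Long.MAX_VALUE;

    private final AtomicLong oldest = new AtomicLong(NONE);
    private final LongSupplier clock;

    OldestEditStamp(LongSupplier clock) {
        this.clock = clock;
    }

    /** Called for every inserted KV; the clock is read only while no stamp is set. */
    void onInsert() {
        if (oldest.get() == NONE) {
            oldest.compareAndSet(NONE, clock.getAsLong());
        }
    }

    /** Called after a flush; the next insert will record a fresh stamp. */
    void onFlush() {
        oldest.set(NONE);
    }

    /** Age of the oldest unflushed edit, or 0 if there is none. */
    long ageMillis(long now) {
        long t = oldest.get();
        return t == NONE ? 0L : now - t;
    }

    public static void main(String[] args) {
        int[] clockReads = {0};
        OldestEditStamp stamp = new OldestEditStamp(() -> {
            clockReads[0]++;
            return System.currentTimeMillis();
        });
        for (int i = 0; i < 1_000_000; i++) {
            stamp.onInsert();
        }
        System.out.println("clock reads: " + clockReads[0]);
    }
}
```

In the single-threaded case the steady-state cost per insert is one volatile read; the clock is consulted once per flush cycle.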

I'm happy to make a sample patch, then we can decide on the merit of the two 
patches.




[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?

2013-04-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13641429#comment-13641429
 ] 

Hadoop QA commented on HBASE-5930:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12580468/5930-track-oldest-sample.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:red}-1 release audit{color}.  The applied patch generated 1 release 
audit warnings (more than the trunk's current 0 warnings).

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5444//testReport/
Release audit warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5444//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5444//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5444//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5444//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5444//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5444//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5444//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5444//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5444//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5444//console

This message is automatically generated.



[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?

2013-04-23 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13639213#comment-13639213
 ] 

stack commented on HBASE-5930:
--

[~lhofhansl] ping.  Question for you in above.

 Periodically flush the Memstore?
 

 Key: HBASE-5930
 URL: https://issues.apache.org/jira/browse/HBASE-5930
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl
Assignee: Devaraj Das
Priority: Minor
 Fix For: 0.95.1

 Attachments: 5930-1.patch, 5930-2.1.patch, 5930-2.2.patch, 
 5930-2.3.patch, 5930-wip.patch




[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?

2013-04-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13639805#comment-13639805
 ] 

Hadoop QA commented on HBASE-5930:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12568108/5930-2.3.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5418//console

This message is automatically generated.



[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?

2013-04-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13639961#comment-13639961
 ] 

Hadoop QA commented on HBASE-5930:
--

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12580196/5930-2.4.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5422//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5422//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5422//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5422//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5422//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5422//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5422//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5422//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5422//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/5422//console

This message is automatically generated.



[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?

2013-04-23 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13640019#comment-13640019
 ] 

Lars Hofhansl commented on HBASE-5930:
--

I'd have to reread the patch.
The semantics that we should achieve is a maximum time for an unlogged KV to 
remain in the memstore (this is different from periodic flushing - I misnamed 
this issue).



[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?

2013-03-28 Thread Nicolas Liochon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13616335#comment-13616335
 ] 

Nicolas Liochon commented on HBASE-5930:


This one has been forgotten. [~lhofhansl], do you have any opinion on the 
patch? [~devaraj], I can finish the work (if any :-) ) if you're busy.

 Periodically flush the Memstore?
 

 Key: HBASE-5930
 URL: https://issues.apache.org/jira/browse/HBASE-5930
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl
Assignee: Devaraj Das
Priority: Minor
 Fix For: 0.95.0

 Attachments: 5930-1.patch, 5930-2.1.patch, 5930-2.2.patch, 
 5930-2.3.patch, 5930-wip.patch




[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?

2013-03-28 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13616417#comment-13616417
 ] 

Devaraj Das commented on HBASE-5930:


Hey [~nkeywal], thanks for bringing this up. I was actually waiting for 
[~lhofhansl] to get back with his opinion on the latest patch... 



[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?

2013-02-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13573705#comment-13573705
 ] 

Hadoop QA commented on HBASE-5930:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12568108/5930-2.3.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   
org.apache.hadoop.hbase.security.access.TestAccessControlFilter

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4369//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4369//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4369//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4369//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4369//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4369//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4369//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4369//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4369//console

This message is automatically generated.

 Periodically flush the Memstore?
 

 Key: HBASE-5930
 URL: https://issues.apache.org/jira/browse/HBASE-5930
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl
Assignee: Devaraj Das
Priority: Minor
 Fix For: 0.96.0

 Attachments: 5930-1.patch, 5930-2.1.patch, 5930-2.2.patch, 
 5930-2.3.patch, 5930-wip.patch




[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?

2013-02-06 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13572632#comment-13572632
 ] 

Devaraj Das commented on HBASE-5930:


I should add that making the checks as described before prevents some 
potentially unneeded flushes, while bounding the max duration an edit lives in 
the memstore... 

[~lhofhansl], could you please take a look at the patch.

 Periodically flush the Memstore?
 

 Key: HBASE-5930
 URL: https://issues.apache.org/jira/browse/HBASE-5930
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl
Assignee: Devaraj Das
Priority: Minor
 Fix For: 0.96.0

 Attachments: 5930-1.patch, 5930-2.1.patch, 5930-2.2.patch, 
 5930-2.3.patch, 5930-wip.patch




[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?

2013-02-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13571929#comment-13571929
 ] 

Hadoop QA commented on HBASE-5930:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12568108/5930-2.3.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.client.TestAdmin

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4344//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4344//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4344//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4344//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4344//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4344//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4344//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4344//console

This message is automatically generated.

 Periodically flush the Memstore?
 

 Key: HBASE-5930
 URL: https://issues.apache.org/jira/browse/HBASE-5930
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl
Assignee: Devaraj Das
Priority: Minor
 Fix For: 0.96.0

 Attachments: 5930-1.patch, 5930-2.1.patch, 5930-2.2.patch, 
 5930-2.3.patch, 5930-wip.patch




[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?

2013-01-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13565003#comment-13565003
 ] 

Hadoop QA commented on HBASE-5930:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12566881/5930-2.2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.io.TestHeapSize

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4226//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4226//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4226//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4226//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4226//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4226//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4226//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4226//console

This message is automatically generated.

 Periodically flush the Memstore?
 

 Key: HBASE-5930
 URL: https://issues.apache.org/jira/browse/HBASE-5930
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl
Assignee: Devaraj Das
Priority: Minor
 Fix For: 0.96.0

 Attachments: 5930-1.patch, 5930-2.1.patch, 5930-2.2.patch, 
 5930-wip.patch




[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?

2013-01-27 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13564097#comment-13564097
 ] 

Lars Hofhansl commented on HBASE-5930:
--

Hmm... This is a bit more difficult than I thought.

I think what we want to limit is this: the maximum time an unflushed edit will 
remain in the memstore. Otherwise one could trickle in 1 edit every hour and 
get very old data in the memstore.
(Doing that could potentially also be cheaper, as we do not need to retrieve the 
current time on each edit, just on the first one after a flush.)

If that is true, then what we want to track is not the time of the newest edit, 
but the time of the oldest unflushed edit, and flush if that gets too old.
In order to avoid flushing all memstores at the same time, we want to offset 
the memstores flush times.
We can do it the way you have it.
(but it seems natural to me to do that at the place where we detect that the 
memstore needs to be flushed. For this to work the chore needs to wake up more 
frequently than the flush interval.)

Btw, the flush interval you have is 10 mins, not 1h.
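
To make the "oldest unflushed edit" idea concrete, here is a minimal sketch in plain Java. It is not the code from any attached patch: the class OldestEditTracker, its method names, and the injected clock are all hypothetical and only illustrate the bookkeeping described above. The clock is consulted only for the first edit after a flush, and the flush decision is based on the age of that edit.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

// Hypothetical illustration only; not an HBase class.
class OldestEditTracker {
  // Sentinel meaning "nothing unflushed in the memstore".
  private static final long NO_EDITS = Long.MAX_VALUE;

  private final AtomicLong timeOfOldestEdit = new AtomicLong(NO_EDITS);

  /** Called on every edit; reads the clock only for the first edit after a flush. */
  void onEdit(LongSupplier clock) {
    if (timeOfOldestEdit.get() == NO_EDITS) {
      // Only one racing writer wins; later edits leave the timestamp untouched.
      timeOfOldestEdit.compareAndSet(NO_EDITS, clock.getAsLong());
    }
  }

  /** Called after a successful flush; the memstore is empty again. */
  void onFlush() {
    timeOfOldestEdit.set(NO_EDITS);
  }

  /** True if the oldest unflushed edit is older than maxAgeMs. */
  boolean shouldFlush(long nowMs, long maxAgeMs) {
    long oldest = timeOfOldestEdit.get();
    return oldest != NO_EDITS && nowMs - oldest > maxAgeMs;
  }
}
```

A chore that calls shouldFlush() would have to wake up more often than maxAgeMs for the bound to be meaningful, as noted above.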


 Periodically flush the Memstore?
 

 Key: HBASE-5930
 URL: https://issues.apache.org/jira/browse/HBASE-5930
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl
Assignee: Devaraj Das
Priority: Minor
 Fix For: 0.96.0

 Attachments: 5930-1.patch, 5930-2.1.patch, 5930-wip.patch




[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?

2013-01-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13562933#comment-13562933
 ] 

Hadoop QA commented on HBASE-5930:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12566533/5930-2.1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.io.TestHeapSize

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4182//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4182//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4182//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4182//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4182//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4182//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4182//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/4182//console

This message is automatically generated.

 Periodically flush the Memstore?
 

 Key: HBASE-5930
 URL: https://issues.apache.org/jira/browse/HBASE-5930
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl
Assignee: Devaraj Das
Priority: Minor
 Fix For: 0.96.0

 Attachments: 5930-1.patch, 5930-2.1.patch, 5930-wip.patch




[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?

2013-01-25 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13563163#comment-13563163
 ] 

Ted Yu commented on HBASE-5930:
---

{code}
+  private Random rand = new Random();
{code}
Please use SecureRandom.
{code}
+  } catch (InterruptedException ie){
+//ignore
{code}
Please restore interrupt status.

Should upper bound for the sleep take length of MemStoreFlusher.flushQueue into 
consideration ?
When many FlushQueueEntry's pile up in flushQueue, we may want to wait longer.

Also, the sleep should be bounded by the remaining time w.r.t. 
cacheFlushInterval - we don't want the loop in chore() to outlast 
cacheFlushInterval.
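
For reference, a small sketch of the two points above (restoring the interrupt status, and bounding the sleep by the time remaining in the current interval). The class BoundedSleep and its method are hypothetical names and not part of the patch; the deadline parameter stands in for "start of this chore() run plus cacheFlushInterval".

```java
// Hypothetical helper; not part of the patch under review.
class BoundedSleep {
  /**
   * Sleeps for requestedMs, but never past deadlineMs (wall-clock millis).
   * Returns false if the thread was interrupted, with the interrupt flag restored.
   */
  static boolean sleepBounded(long requestedMs, long deadlineMs) {
    long remainingMs = deadlineMs - System.currentTimeMillis();
    long sleepMs = Math.min(requestedMs, remainingMs);
    if (sleepMs <= 0) {
      return true; // no time left in this interval, skip the sleep
    }
    try {
      Thread.sleep(sleepMs);
      return true;
    } catch (InterruptedException ie) {
      // Do not swallow the interrupt; let the caller and the owning thread see it.
      Thread.currentThread().interrupt();
      return false;
    }
  }
}
```

The caller can stop iterating over regions when this returns false, so that a stopping region server is not held up by the chore.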

 Periodically flush the Memstore?
 

 Key: HBASE-5930
 URL: https://issues.apache.org/jira/browse/HBASE-5930
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl
Assignee: Devaraj Das
Priority: Minor
 Fix For: 0.96.0

 Attachments: 5930-1.patch, 5930-2.1.patch, 5930-wip.patch




[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?

2013-01-25 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13563193#comment-13563193
 ] 

Lars Hofhansl commented on HBASE-5930:
--

Absolutely do not use SecureRandom here. We're not using this to generate 
cryptographic keys, just some jitter for memstore flush timing, right?
SecureRandom will exhaust your locally generated entropy, which is much better 
used in cases where it is actually needed (and it can hang, on Linux at least, 
if not enough entropy has been collected).
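
As an illustration of the point, a plain java.util.Random is enough for timing jitter. The sketch below is an assumption-laden example, not the patch: FlushJitter and nextDelayMs are hypothetical names, and the upper bound is left to the caller.

```java
import java.util.Random;

// Hypothetical illustration; not taken from the patch.
class FlushJitter {
  // Non-cryptographic PRNG: cheap, and it never blocks waiting for entropy.
  private final Random rand;

  FlushJitter() {
    this(new Random());
  }

  FlushJitter(Random rand) {
    this.rand = rand;
  }

  /** Returns a delay in [0, maxDelayMs), or 0 if maxDelayMs is not positive. */
  long nextDelayMs(int maxDelayMs) {
    if (maxDelayMs <= 0) {
      return 0L;
    }
    return rand.nextInt(maxDelayMs);
  }
}
```

Nothing here needs to be unpredictable to an attacker; the only goal is that regions do not all pick the same delay.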

 Periodically flush the Memstore?
 

 Key: HBASE-5930
 URL: https://issues.apache.org/jira/browse/HBASE-5930
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl
Assignee: Devaraj Das
Priority: Minor
 Fix For: 0.96.0

 Attachments: 5930-1.patch, 5930-2.1.patch, 5930-wip.patch




[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?

2013-01-25 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13563217#comment-13563217
 ] 

Lars Hofhansl commented on HBASE-5930:
--

I think the delay should be algorithmically related to the flush interval (like 
interval / 3 or something; we could make the jitter factor configurable).
Could we fold the jitter into shouldFlush() rather than actually waiting in 
chore()?
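
A rough sketch of what folding the jitter into the flush check could look like, assuming each region picks a fixed random offset once and adds it to the interval. JitteredFlushPolicy, jitterDivisor and the method names are hypothetical; this is one possible reading of the suggestion, not what the patch does.

```java
import java.util.Random;

// Hypothetical illustration of the suggestion above; not the patch's code.
class JitteredFlushPolicy {
  private final long flushIntervalMs;
  // Fixed per-region offset in [0, flushIntervalMs / jitterDivisor).
  private final long offsetMs;

  JitteredFlushPolicy(long flushIntervalMs, int jitterDivisor, Random rand) {
    this.flushIntervalMs = flushIntervalMs;
    long maxOffsetMs = Math.max(1L, flushIntervalMs / jitterDivisor);
    this.offsetMs = (long) (rand.nextDouble() * maxOffsetMs);
  }

  long offsetMs() {
    return offsetMs;
  }

  /** No sleeping in the chore: the jitter just shifts each region's deadline. */
  boolean shouldFlush(long oldestEditMs, long nowMs) {
    return nowMs - oldestEditMs > flushIntervalMs + offsetMs;
  }
}
```

With a divisor of 3 and a 10 minute interval, regions written at the same moment would become due for a flush spread over roughly 3.3 minutes, and the chore thread itself never has to wait.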


 Periodically flush the Memstore?
 

 Key: HBASE-5930
 URL: https://issues.apache.org/jira/browse/HBASE-5930
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl
Assignee: Devaraj Das
Priority: Minor
 Fix For: 0.96.0

 Attachments: 5930-1.patch, 5930-2.1.patch, 5930-wip.patch




[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?

2013-01-25 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13563225#comment-13563225
 ] 

Devaraj Das commented on HBASE-5930:


bq. I think the delay should algorithmically related to the flush interval
I think what I currently have has a certain advantage - like if the configured 
value of cacheflushinterval is too low or something, the chore will be 
triggered very often but the sleep interval (0 - 2 minutes) would keep the 
#flushes under control. But yeah, I can always enforce a minimum delay before 
each flush.
 
bq. Could we fold the jitter into shouldFlush() rather than actually waiting in 
chore()?
I think that shouldFlush shouldn't be involved in determining how much to delay 
the flush. Do you see any issues in waiting in the chore?

 Periodically flush the Memstore?
 

 Key: HBASE-5930
 URL: https://issues.apache.org/jira/browse/HBASE-5930
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl
Assignee: Devaraj Das
Priority: Minor
 Fix For: 0.96.0

 Attachments: 5930-1.patch, 5930-2.1.patch, 5930-wip.patch




[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?

2013-01-25 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13563261#comment-13563261
 ] 

Devaraj Das commented on HBASE-5930:


bq. Should upper bound for the sleep take length of MemStoreFlusher.flushQueue 
into consideration ?

[~yuzhih...@gmail.com], I think we don't have to worry about this one as much. 
The reason being that there is a random delay before each flush is inserted in 
the queue (as opposed to inserts coming in at a rate faster than what the 
flusher can handle).

bq. Also, the sleep should be bounded by the remaining time w.r.t. 
cacheFlushInterval - we don't want the loop in chore() to outlast 
cacheFlushInterval.

This should be fine. I don't see issues with this one.

 Periodically flush the Memstore?
 

 Key: HBASE-5930
 URL: https://issues.apache.org/jira/browse/HBASE-5930
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl
Assignee: Devaraj Das
Priority: Minor
 Fix For: 0.96.0

 Attachments: 5930-1.patch, 5930-2.1.patch, 5930-wip.patch




[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?

2013-01-24 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13561495#comment-13561495
 ] 

nkeywal commented on HBASE-5930:


bq. With this, maybe we will no longer need skipWAL if we can prove that 
deferred flush is as fast as skip WAL. 
In standard databases, skipping the WAL is often used when you're doing a 
functional upgrade requiring some unavailability time, i.e.:
- dump
- run batch scripts to update your data
- if anything goes wrong reload the dump

For hundreds of reasons it makes much less sense with HBase, but it could 
happen (some companies don't need 24x7 availability). So we should not remove 
skipWAL, imho, unless it really simplifies something internally.


On the patch itself, I have a question on adding some randomness. The scenario 
I'm thinking about is a massive but periodic update on a table: all the regions 
will be written simultaneously, hence flushed simultaneously. That's the main 
use case for this JIRA, and this could hammer the namenode, imho. Unless we 
think there is enough randomness by having a different flusher per regionserver 
(which may not be the case if all region servers are started simultaneously). 

As a side note, I would personally like a flush interval of 10 minutes:
- it would help on .META. recovery, especially with the separate wal for .META.
- this allows us to have more regions: today, on average and in theory, each 
region takes 50% of an hdfs block size of memory. The more regions we flush 
early, the more empty memstores we have...

 Periodically flush the Memstore?
 

 Key: HBASE-5930
 URL: https://issues.apache.org/jira/browse/HBASE-5930
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl
Assignee: Devaraj Das
Priority: Minor
 Fix For: 0.96.0

 Attachments: 5930-1.patch, 5930-wip.patch




[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?

2013-01-24 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13561739#comment-13561739
 ] 

Lars Hofhansl commented on HBASE-5930:
--

How can deferred log flush ever be as fast as not writing the WAL at all?
Considering only the latency of a single request that might be true in many 
cases, but it will definitely not be true on a busy cluster since all data is 
written to the disks twice.

 Periodically flush the Memstore?
 

 Key: HBASE-5930
 URL: https://issues.apache.org/jira/browse/HBASE-5930
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl
Assignee: Devaraj Das
Priority: Minor
 Fix For: 0.96.0

 Attachments: 5930-1.patch, 5930-wip.patch




[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?

2013-01-24 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13561810#comment-13561810
 ] 

Ted Yu commented on HBASE-5930:
---

Where is PeriodicMemstoreFlusher instantiated ?
Currently MEMSTORE_PERIODIC_FLUSH_INTERVAL is read by both HRegion and 
PeriodicMemstoreFlusher.
{code}
+  boolean shouldFlush() {
{code}
Can we pass the interval to the above method so that HRegion doesn't need to 
introduce:
{code}
+  private long flushCheckInterval;
{code}
What value for MEMSTORE_PERIODIC_FLUSH_INTERVAL would be interpreted as 
disabling the periodic flush ?

Thanks
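
To illustrate the two questions above, here is a sketch of a check that takes the interval as a parameter and treats a non-positive interval as "periodic flush disabled". Both the class PeriodicFlushCheck and the "<= 0 disables" convention are assumptions made for this example; the thread does not settle either point.

```java
// Hypothetical illustration; names and the disable convention are assumptions.
class PeriodicFlushCheck {
  private volatile long lastFlushTimeMs;

  PeriodicFlushCheck(long lastFlushTimeMs) {
    this.lastFlushTimeMs = lastFlushTimeMs;
  }

  /** Record that a flush just completed. */
  void markFlushed(long nowMs) {
    lastFlushTimeMs = nowMs;
  }

  /** The interval is passed in by the caller, so no field is needed for it here. */
  boolean shouldFlush(long flushIntervalMs, long nowMs) {
    if (flushIntervalMs <= 0) {
      return false; // assumed convention: non-positive interval disables periodic flush
    }
    return nowMs - lastFlushTimeMs > flushIntervalMs;
  }
}
```

Passing the interval in keeps the configuration lookup in one place (the chore), at the cost of tying the method signature to one particular flush criterion.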

 Periodically flush the Memstore?
 

 Key: HBASE-5930
 URL: https://issues.apache.org/jira/browse/HBASE-5930
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl
Assignee: Devaraj Das
Priority: Minor
 Fix For: 0.96.0

 Attachments: 5930-1.patch, 5930-wip.patch




[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?

2013-01-24 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13561837#comment-13561837
 ] 

Devaraj Das commented on HBASE-5930:


[~lhofhansl], yeah what [~enis] meant IMHO is that the latency from the 
client's point of view would improve when deferred flush is used for the 
mutations. Also, we considered the case that users would most likely not want 
to skip WAL if we promise them that there wouldn't be latency issues (maybe on 
a busy cluster). But yeah, it'd not make a difference on the overall IOPS in 
the cluster...

[~nkeywal], I generally agree with you that we should not remove the skipWal
option without giving it some really good thought and considering more use
cases. And yes, the idea of randomizing the flushes across regionservers sounds
good. I'll think about how to incorporate that.

[~yuzhih...@gmail.com], good catch on the instantiation :) I was focusing on
getting the logic right and forgot to instantiate the chore. I'd prefer to leave
the shouldFlush() signature as is. It is an implementation detail that
shouldFlush() currently uses the same constant underneath; it could just as
well use a different constant, or a future shouldFlush() implementation might
not use this constant at all.

 Periodically flush the Memstore?
 

 Key: HBASE-5930
 URL: https://issues.apache.org/jira/browse/HBASE-5930
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl
Assignee: Devaraj Das
Priority: Minor
 Fix For: 0.96.0

 Attachments: 5930-1.patch, 5930-wip.patch




[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?

2013-01-24 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13562192#comment-13562192
 ] 

Ted Yu commented on HBASE-5930:
---

w.r.t. randomizing the flushes across regionservers, one approach is to
introduce a new znode whose data is the cluster-wide count of outstanding flush
requests. We place an upper bound on this count. PeriodicMemstoreFlusher
wouldn't create a new request if the count is at the upper bound.
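
The bounded-count idea can be sketched as follows. This is only an illustration under assumptions: an in-process counter stands in for the znode, and FlushRequestLimiter, tryAcquire(), release() and outstandingCount() are hypothetical names, not part of HBase or of any patch on this issue.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: caps the number of outstanding periodic flush requests.
// An AtomicInteger stands in for the znode counter proposed in the comment.
final class FlushRequestLimiter {
    private final int maxOutstanding;
    private final AtomicInteger outstanding = new AtomicInteger();

    FlushRequestLimiter(int maxOutstanding) {
        if (maxOutstanding <= 0) {
            throw new IllegalArgumentException("maxOutstanding must be positive");
        }
        this.maxOutstanding = maxOutstanding;
    }

    // Returns true and counts the request if the upper bound is not yet reached.
    boolean tryAcquire() {
        while (true) {
            int current = outstanding.get();
            if (current >= maxOutstanding) {
                return false; // at the upper bound: do not create a new request
            }
            if (outstanding.compareAndSet(current, current + 1)) {
                return true;
            }
        }
    }

    // Called when a flush request completes; never drops below zero.
    void release() {
        outstanding.updateAndGet(n -> n > 0 ? n - 1 : 0);
    }

    int outstandingCount() {
        return outstanding.get();
    }
}
```

A periodic flusher would call tryAcquire() before queueing a flush and release() once it finishes. A real cluster-wide bound would need the same check-and-increment to be done atomically against ZooKeeper, which this sketch does not attempt.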

 Periodically flush the Memstore?
 

 Key: HBASE-5930
 URL: https://issues.apache.org/jira/browse/HBASE-5930
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl
Assignee: Devaraj Das
Priority: Minor
 Fix For: 0.96.0

 Attachments: 5930-1.patch, 5930-wip.patch




[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?

2013-01-24 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13562213#comment-13562213
 ] 

Lars Hofhansl commented on HBASE-5930:
--

That would work (using a znode). I do think it's fine to place an upper limit 
per regionserver, and maybe we won't need an upper limit at all.
I also like the idea of some randomness. We could stagger per memstore and add a
random jitter that could be up to 1/2 (just making this up, though) of the
flush interval. We'd get a new random jitter number after each flush and at
memstore creation.
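
A minimal sketch of the per-memstore random offset described above, assuming the offset is added to the flush interval and drawn uniformly from 0 to half the interval. JitteredFlushDeadline and its methods are hypothetical names used for illustration only.

```java
import java.util.Random;

// Hypothetical sketch: tracks when one memstore is next due for a periodic flush.
final class JitteredFlushDeadline {
    private final long flushIntervalMs;
    private final Random random;
    private long nextFlushAtMs;

    JitteredFlushDeadline(long flushIntervalMs, Random random, long nowMs) {
        if (flushIntervalMs <= 0) {
            throw new IllegalArgumentException("flushIntervalMs must be positive");
        }
        this.flushIntervalMs = flushIntervalMs;
        this.random = random;
        reschedule(nowMs); // first random offset is drawn at memstore creation
    }

    // Draws a fresh offset in [0, interval / 2] and sets the next deadline.
    // Intended to be called again after each flush.
    void reschedule(long nowMs) {
        long maxJitterMs = flushIntervalMs / 2;
        long jitterMs = (long) (random.nextDouble() * (maxJitterMs + 1));
        nextFlushAtMs = nowMs + flushIntervalMs + jitterMs;
    }

    boolean isDue(long nowMs) {
        return nowMs >= nextFlushAtMs;
    }

    long nextFlushAtMs() {
        return nextFlushAtMs;
    }
}
```

Because every memstore draws its own offset, memstores created at the same moment should not all become due in the same chore run. Whether the offset should lengthen or shorten the interval is a design choice the comment leaves open.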


 Periodically flush the Memstore?
 

 Key: HBASE-5930
 URL: https://issues.apache.org/jira/browse/HBASE-5930
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl
Assignee: Devaraj Das
Priority: Minor
 Fix For: 0.96.0

 Attachments: 5930-1.patch, 5930-wip.patch




[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?

2013-01-23 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13561302#comment-13561302
 ] 

Enis Soztutar commented on HBASE-5930:
--

bq. I would like to pick this up again and add a flag to Mutation to indicate 
deferred WAL sync. If HRegion receives a batch of Mutation of which at least 
one is not marked as deferred the log is sync'ed. Otherwise it is deferred.
I like the idea of having a deferred flush at the Put level. Now the weird 
thing is that it is per table, not per column family. I guess we can have 
per-table/per-cf or per batch deferred flush setting. 
With this, maybe we will no longer need skipWAL if we can prove that deferred 
flush is as fast as skip WAL. Most of the time, we actually do not want to skip 
WAL, we just want a deferred flush.
bq. I decided to separate the issue of having the feature on asynchronous write 
to WAL from the periodic flush
+1 on separating the two.



 Periodically flush the Memstore?
 

 Key: HBASE-5930
 URL: https://issues.apache.org/jira/browse/HBASE-5930
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl
Assignee: Devaraj Das
Priority: Minor
 Fix For: 0.96.0

 Attachments: 5930-wip.patch




[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?

2013-01-23 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13561358#comment-13561358
 ] 

Lars Hofhansl commented on HBASE-5930:
--

I'd be fine with this in 0.94 as well.

 Periodically flush the Memstore?
 

 Key: HBASE-5930
 URL: https://issues.apache.org/jira/browse/HBASE-5930
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl
Assignee: Devaraj Das
Priority: Minor
 Fix For: 0.96.0

 Attachments: 5930-wip.patch




[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?

2013-01-22 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13560237#comment-13560237
 ] 

Enis Soztutar commented on HBASE-5930:
--

Regardless of whether mutations have deferred sync, we might always want to 
flush periodically. We are rolling the WAL periodically, but if we do not 
flush, we may end up with a lot of hlogs to recover from in case of failover.

 Periodically flush the Memstore?
 

 Key: HBASE-5930
 URL: https://issues.apache.org/jira/browse/HBASE-5930
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl
Assignee: Devaraj Das
Priority: Minor



[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?

2013-01-22 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13560253#comment-13560253
 ] 

Devaraj Das commented on HBASE-5930:


Yes, [~enis], that's my plan..

 Periodically flush the Memstore?
 

 Key: HBASE-5930
 URL: https://issues.apache.org/jira/browse/HBASE-5930
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl
Assignee: Devaraj Das
Priority: Minor



[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?

2012-11-17 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13499603#comment-13499603
 ] 

Lars Hofhansl commented on HBASE-5930:
--

I would like to pick this up again and add a flag to Mutation to indicate 
deferred WAL sync. If HRegion receives a batch of Mutation of which at least 
one is not marked as deferred the log is sync'ed. Otherwise it is deferred.
This will mingle well later with HBASE-5954.
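
The batch rule proposed above can be sketched like this. DeferrableMutation and BatchSyncPolicy are hypothetical stand-ins for illustration; they are not the HBase Mutation class or any API from a patch on this issue.

```java
import java.util.List;

// Hypothetical stand-in for a mutation carrying a "deferred WAL sync" flag.
final class DeferrableMutation {
    final String row;
    final boolean deferredSync;

    DeferrableMutation(String row, boolean deferredSync) {
        this.row = row;
        this.deferredSync = deferredSync;
    }
}

final class BatchSyncPolicy {
    private BatchSyncPolicy() {
    }

    // The log is synced right away if at least one mutation in the batch is
    // not marked as deferred; otherwise the sync is deferred.
    static boolean mustSyncNow(List<DeferrableMutation> batch) {
        for (DeferrableMutation m : batch) {
            if (!m.deferredSync) {
                return true;
            }
        }
        return false; // all deferred (an empty batch also has nothing to sync)
    }
}
```

Under this rule the strictest mutation in a batch decides the durability of the whole batch, so a caller mixing deferred and non-deferred edits never gets weaker guarantees than requested.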

 Periodically flush the Memstore?
 

 Key: HBASE-5930
 URL: https://issues.apache.org/jira/browse/HBASE-5930
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl
Priority: Minor



[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?

2012-07-10 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13410457#comment-13410457
 ] 

nkeywal commented on HBASE-5930:


I also think that a periodic memstore flush makes sense. Even with a WAL, it
seems safer/more efficient.
It seems that HBase had this a long long time ago:

{noformat}
  <property>
    <name>hbase.regionserver.optionalcacheflushinterval</name>
    <value>1800000</value>
    <description>
    Amount of time to wait since the last time a region was flushed before
    invoking an optional cache flush (An optional cache flush is a
    flush even though memcache is not at the memcache.flush.size).
    Default: 30 minutes (in milliseconds)
    </description>
  </property>
{noformat}

It could also be linked to major compactions (before a major compaction, flush 
'old' memstore)?

 Periodically flush the Memstore?
 

 Key: HBASE-5930
 URL: https://issues.apache.org/jira/browse/HBASE-5930
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl
Priority: Minor


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?

2012-05-05 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13268995#comment-13268995
 ] 

Andrew Purtell commented on HBASE-5930:
---

+1 We basically do the same thing as proposed but on the client side with a 
shared DAO layer.

 Periodically flush the Memstore?
 

 Key: HBASE-5930
 URL: https://issues.apache.org/jira/browse/HBASE-5930
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl
Priority: Minor





[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?

2012-05-05 Thread Matt Corgan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13269025#comment-13269025
 ] 

Matt Corgan commented on HBASE-5930:


Periodically flushing the memstore seems like a good feature to me.  Could also 
help clear out cold data from memory to make more room for bigger memstores on 
regions that are actually being used.

A different solution to the underlying data loss issue might be to have a third 
client setting for WAL writing: NONE, SYNC, and ASYNC.  ASYNC would write the 
data to a memory buffer, return success to the client, and another thread would 
flush the buffer to the WAL.  The WAL would ideally only lag a few seconds 
behind the memstores, but some form of throttling would probably be needed.
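
A rough sketch of the three-mode idea, under several assumptions: an in-memory list stands in for the WAL on HDFS, a bounded queue provides the throttling, and WalMode, SketchWalWriter, write(), drainPending() and durableEdits() are hypothetical names rather than HBase APIs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// The three client-side settings named in the comment.
enum WalMode { NONE, SYNC, ASYNC }

// Hypothetical sketch: durableLog stands in for the WAL, pending is the
// memory buffer that a background thread would drain.
final class SketchWalWriter {
    private final List<String> durableLog = new ArrayList<>();
    private final BlockingQueue<String> pending;

    SketchWalWriter(int maxPending) {
        this.pending = new ArrayBlockingQueue<>(maxPending);
    }

    // Returns false only when the ASYNC buffer is full, so the caller can
    // back off; the bounded queue is the throttle in this sketch.
    boolean write(String edit, WalMode mode) {
        switch (mode) {
            case NONE:
                return true; // edit is never logged
            case SYNC:
                synchronized (durableLog) {
                    durableLog.add(edit);
                }
                return true;
            case ASYNC:
                return pending.offer(edit);
            default:
                throw new IllegalStateException("unknown mode: " + mode);
        }
    }

    // A background thread would call this repeatedly; returns the number of
    // buffered edits moved into the log.
    int drainPending() {
        List<String> batch = new ArrayList<>();
        pending.drainTo(batch);
        synchronized (durableLog) {
            durableLog.addAll(batch);
        }
        return batch.size();
    }

    List<String> durableEdits() {
        synchronized (durableLog) {
            return new ArrayList<>(durableLog);
        }
    }
}
```

In this sketch ASYNC edits can be lost if the process dies before drainPending() runs, which is the lag of "a few seconds" the comment accepts. The sketch also does not preserve ordering between SYNC and ASYNC edits, which a real WAL would have to address.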

 Periodically flush the Memstore?
 

 Key: HBASE-5930
 URL: https://issues.apache.org/jira/browse/HBASE-5930
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl
Priority: Minor





[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?

2012-05-05 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13269070#comment-13269070
 ] 

stack commented on HBASE-5930:
--

Is our deferred flush == ASYNC described above?

 Periodically flush the Memstore?
 

 Key: HBASE-5930
 URL: https://issues.apache.org/jira/browse/HBASE-5930
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl
Priority: Minor





[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?

2012-05-05 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13269074#comment-13269074
 ] 

Lars Hofhansl commented on HBASE-5930:
--

That (deferred flush) is what I told my colleague to use last week.
Would be nice if the client could control this (in addition to writeToWal, we 
could have writeToWalAsynchronously - or something).

A periodic memstore flush still makes sense. If I get some time next week I'll
come up with a patch (unless somebody else wants to take this :) ).

 Periodically flush the Memstore?
 

 Key: HBASE-5930
 URL: https://issues.apache.org/jira/browse/HBASE-5930
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl
Priority: Minor





[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?

2012-05-05 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13269116#comment-13269116
 ] 

stack commented on HBASE-5930:
--

I like the idea of the client saying whether to put it on the deferred flush
queue or whether it's to be flushed immediately.

 Periodically flush the Memstore?
 

 Key: HBASE-5930
 URL: https://issues.apache.org/jira/browse/HBASE-5930
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl
Priority: Minor





[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?

2012-05-03 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267754#comment-13267754
 ] 

Todd Lipcon commented on HBASE-5930:


Seems reasonable to flush the memstore if it's had no write activity at all in 
N minutes. Then it shouldn't lead to smaller storefiles, right?

 Periodically flush the Memstore?
 

 Key: HBASE-5930
 URL: https://issues.apache.org/jira/browse/HBASE-5930
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl
Priority: Minor





[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?

2012-05-03 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267765#comment-13267765
 ] 

Lars Hofhansl commented on HBASE-5930:
--

What should trigger the flush is an interesting discussion in itself. Should we 
flush:
* after N timeunits of write inactivity, or
* when the last flush happened more than N TUs ago

The former would avoid smaller storefiles, the latter would put a limit on how 
stale an entry in the memstore can be.
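
The two candidate triggers can be put side by side in a small sketch. FlushTriggerState and its method names are hypothetical, and the guard that skips memstores with no unflushed edits is an added assumption, not something stated in the comment.

```java
// Hypothetical sketch: per-memstore timestamps for the two flush triggers.
final class FlushTriggerState {
    private long lastWriteMs;
    private long lastFlushMs;
    private boolean hasUnflushedEdits;

    FlushTriggerState(long nowMs) {
        this.lastWriteMs = nowMs;
        this.lastFlushMs = nowMs;
    }

    void onWrite(long nowMs) {
        lastWriteMs = nowMs;
        hasUnflushedEdits = true;
    }

    void onFlush(long nowMs) {
        lastFlushMs = nowMs;
        hasUnflushedEdits = false;
    }

    // Trigger 1: at least n time units have passed without any write.
    boolean dueByInactivity(long nowMs, long n) {
        return hasUnflushedEdits && nowMs - lastWriteMs >= n;
    }

    // Trigger 2: the last flush happened at least n time units ago.
    boolean dueByFlushAge(long nowMs, long n) {
        return hasUnflushedEdits && nowMs - lastFlushMs >= n;
    }
}
```

With a steady trickle of writes, dueByInactivity() may never fire, while dueByFlushAge() fires every n time units regardless. That is the trade-off between fewer small storefiles and a bound on how stale a memstore entry can get.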


 Periodically flush the Memstore?
 

 Key: HBASE-5930
 URL: https://issues.apache.org/jira/browse/HBASE-5930
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl
Priority: Minor





[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?

2012-05-03 Thread Matt Corgan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267812#comment-13267812
 ] 

Matt Corgan commented on HBASE-5930:


Maybe add a boolean to the memstore to track if it contains edits that were not 
written to the WAL.  No need to auto-flush in the frequent case where all edits 
are in the WAL.

 Periodically flush the Memstore?
 

 Key: HBASE-5930
 URL: https://issues.apache.org/jira/browse/HBASE-5930
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl
Priority: Minor





[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?

2012-05-03 Thread Jean-Daniel Cryans (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267826#comment-13267826
 ] 

Jean-Daniel Cryans commented on HBASE-5930:
---

bq. No need to auto-flush in the frequent case where all edits are in the WAL.

And we already roll every hour. From LogRoller:

bq. this.rollperiod = 
this.server.getConfiguration().getLong("hbase.regionserver.logroll.period", 
3600000);

Meaning that your data in the WAL can only be sitting there for so long.

 Periodically flush the Memstore?
 

 Key: HBASE-5930
 URL: https://issues.apache.org/jira/browse/HBASE-5930
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl
Priority: Minor





[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?

2012-05-03 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13267831#comment-13267831
 ] 

Todd Lipcon commented on HBASE-5930:


bq. Maybe add a boolean to the memstore to track if it contains edits that were 
not written to the WAL

HBASE-5886 adds code which tracks how much un-WAL-ed data is in the memstore.

bq. Meaning that your data in the WAL can only be sitting there for so long.

But if we retain 20 or so HLogs, and we roll only every hour, then we still 
have 20 hours worth of data sitting there unflushed, which might be a little 
strange if the cluster is entirely idle.


 Periodically flush the Memstore?
 

 Key: HBASE-5930
 URL: https://issues.apache.org/jira/browse/HBASE-5930
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl
Priority: Minor





[jira] [Commented] (HBASE-5930) Periodically flush the Memstore?

2012-05-03 Thread binlijin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13268054#comment-13268054
 ] 

binlijin commented on HBASE-5930:
-

This feature looks good.

 Periodically flush the Memstore?
 

 Key: HBASE-5930
 URL: https://issues.apache.org/jira/browse/HBASE-5930
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl
Priority: Minor
