[
https://issues.apache.org/jira/browse/HBASE-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13148231#comment-13148231
]
Hudson commented on HBASE-2856:
-------------------------------
Integrated in HBase-TRUNK #2427 (See
[https://builds.apache.org/job/HBase-TRUNK/2427/])
HBASE-3690 Option to Exclude Bulk Import Files from Minor Compaction
Summary:
We ran an incremental scrape with HFileOutputFormat and
encountered major compaction storms. This is caused by the bug in
HBASE-3404. The permanent fix is a little tricky without HBASE-2856. We
realized that a quicker solution for avoiding these compaction storms is
to simply exclude bulk import files from minor compactions and let them
only be handled by time-based major compactions. Added this functionality,
along with a config option to enable it.
Rewrote this feature to be done on a per-bulkload basis.
Test Plan:
- mvn test -Dtest=TestHFileOutputFormat
DiffCamp Revision:
Reviewers: stack, Kannan, JIRA, dhruba
Reviewed By: stack
CC: dhruba, lhofhansl, nspiegelberg, stack
Differential Revision: 357
nspiegelberg :
Files :
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java
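A minimal sketch of how a bulk-load job could opt its output HFiles out of
minor compactions once this change is in. The property name
hbase.mapreduce.hfileoutputformat.compaction.exclude and the job setup below
are assumptions for illustration, not taken from this comment; check
HFileOutputFormat in the committed patch for the exact key.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat;
import org.apache.hadoop.mapreduce.Job;

public class BulkLoadJobSetup {
  public static Job createJob(String tableName) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    // Assumed property name from HBASE-3690: tag the generated HFiles so that
    // minor compactions skip them and only time-based major compactions pick
    // them up. Verify the exact key against HFileOutputFormat in the patch.
    conf.setBoolean("hbase.mapreduce.hfileoutputformat.compaction.exclude", true);

    Job job = new Job(conf, "bulk-load-" + tableName);
    HTable table = new HTable(conf, tableName);
    // configureIncrementalLoad wires the partitioner, reducer count and output
    // format to match the table's current region boundaries.
    HFileOutputFormat.configureIncrementalLoad(job, table);
    return job;
  }
}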
> TestAcidGuarantee broken on trunk
> ----------------------------------
>
> Key: HBASE-2856
> URL: https://issues.apache.org/jira/browse/HBASE-2856
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.89.20100621
> Reporter: ryan rawson
> Assignee: Amitanand Aiyer
> Priority: Blocker
> Fix For: 0.94.0
>
> Attachments: 2856-v2.txt, 2856-v3.txt, 2856-v4.txt, 2856-v5.txt,
> acid.txt
>
>
> TestAcidGuarantee has a test that reads a number of columns from a row, and
> every so often the first of the N columns differs from the rest when it
> should be the same. This is a bug deep inside the scanner: the first peek()
> of a row is done at time T, then the rest of the read is done at T+1 after a
> flush, so the memstoreTS data is lost and previously 'uncommitted' data
> becomes committed and flushed to disk.
> One possible solution is to introduce the memstoreTS (or an equivalent
> value) into the HFile, allowing us to preserve read consistency past
> flushes. Another solution is to fix the scanners so that peek() is not
> destructive (though it might then return different things at different times).
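As an illustration of the first suggestion (not the actual patch): if every
cell carries its memstoreTS into the HFile, a scanner can fix a read point
once when it opens and skip anything written later, so a flush in the middle
of a scan cannot expose data that was uncommitted when the scan started. A
minimal sketch with hypothetical names:

import java.util.ArrayList;
import java.util.List;

/** Illustrative only: a per-cell memstoreTS plus a fixed read point keeps a
 *  scan consistent across a flush. Class and field names are hypothetical. */
class Cell {
  final byte[] value;
  final long memstoreTS; // write number assigned when the cell was committed
  Cell(byte[] value, long memstoreTS) { this.value = value; this.memstoreTS = memstoreTS; }
}

class ConsistentScanner {
  private final long readPoint;   // snapshot taken once, when the scan opens
  private final List<Cell> cells; // cells from memstore and HFiles combined

  ConsistentScanner(List<Cell> cells, long currentReadPoint) {
    this.cells = cells;
    this.readPoint = currentReadPoint; // fixed for the lifetime of the scan
  }

  /** Returns only cells committed at or before the read point, so a flush
   *  mid-scan cannot make previously uncommitted data visible. */
  List<Cell> next() {
    List<Cell> visible = new ArrayList<Cell>();
    for (Cell c : cells) {
      if (c.memstoreTS <= readPoint) {
        visible.add(c);
      }
    }
    return visible;
  }
}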