[ https://issues.apache.org/jira/browse/HBASE-12363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14193026#comment-14193026 ]

Hadoop QA commented on HBASE-12363:
-----------------------------------

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12678670/12363-master.txt
  against trunk revision .
  ATTACHMENT ID: 12678670

    {color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

    {color:green}+1 tests included{color}.  The patch appears to include 27 new 
or modified tests.

    {color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

    {color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

    {color:red}-1 checkstyle{color}.  The applied patch generated 
3784 checkstyle errors (more than the trunk's current 3781 errors).

    {color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

    {color:red}-1 release audit{color}.  The applied patch generated 1 release 
audit warning (more than the trunk's current 0 warnings).

    {color:red}-1 lineLengths{color}.  The patch introduces the following lines 
longer than 100:
    +    return setValue(KEEP_DELETED_CELLS, (keepDeletedCells ? KeepDeletedCells.TRUE : KeepDeletedCells.FALSE).toString());
    +    this.keepDeletedCells = scan.isRaw() ? KeepDeletedCells.TRUE : isUserScan ? KeepDeletedCells.FALSE : scanInfo.getKeepDeletedCells();
    +    this.seePastDeleteMarkers = scanInfo.getKeepDeletedCells() != KeepDeletedCells.FALSE && isUserScan;
    +    ScanInfo scanInfo = new ScanInfo(null, 0, 1, HConstants.LATEST_TIMESTAMP, KeepDeletedCells.FALSE,
    +      family.setKeepDeletedCells(org.apache.hadoop.hbase.KeepDeletedCells.valueOf(arg.delete(org.apache.hadoop.hbase.HColumnDescriptor::KEEP_DELETED_CELLS).to_s.upcase)) if arg.include?(org.apache.hadoop.hbase.HColumnDescriptor::KEEP_DELETED_CELLS)

    {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

    {color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11555//testReport/
Release audit warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11555//artifact/patchprocess/patchReleaseAuditWarnings.txt
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11555//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11555//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11555//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11555//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11555//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11555//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11555//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11555//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11555//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11555//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11555//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11555//artifact/patchprocess/checkstyle-aggregate.html

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11555//console

This message is automatically generated.

> KEEP_DELETED_CELLS considered harmful?
> --------------------------------------
>
>                 Key: HBASE-12363
>                 URL: https://issues.apache.org/jira/browse/HBASE-12363
>             Project: HBase
>          Issue Type: Sub-task
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>              Labels: Phoenix
>         Attachments: 12363-master.txt, 12363-test.txt
>
>
> Brainstorming...
> This morning in the train (of all places) I realized a fundamental issue in 
> how KEEP_DELETED_CELLS is implemented.
> The problem is around knowing when it is safe to remove a delete marker (we 
> cannot remove it unless all cells affected by it are removed otherwise).
> This was particularly hard for family markers, since they sort before all 
> cells of a row, and hence scanning forward through an HFile you cannot know 
> whether the family markers are still needed until at least the entire row is 
> scanned.
> My solution was to keep the TS of the oldest put in any given HFile, and only 
> remove delete markers older than that TS.
> That sounds good on the face of it... But now imagine you wrote a version of 
> ROW 1 and then never update it again. Then later you write a billion other 
> rows and delete them all. Since the TS of the cells in ROW 1 is older than 
> all the delete markers for the other billion rows, these will never be 
> collected... At least for the region that hosts ROW 1 after a major 
> compaction.
> Note, in a sense that is what HBase is supposed to do when keeping deleted 
> cells: Keep them until they would be removed by some other means (for example 
> TTL, or MAX_VERSION when new versions are inserted).
> The specific problem here is that even when all KVs affected by a delete 
> marker have expired this way, the marker will not be removed if there is just 
> one older KV in the HStore.
> I don't see a good way out of this. In the parent issue I outlined these four 
> options:
> # Only allow the new flag to be set on CFs with TTL set. MIN_VERSIONS would 
> not apply to deleted rows or delete marker rows (we wouldn't know how long to 
> keep family deletes in that case). (MAX)VERSIONS would still be enforced on 
> all row types except for family delete markers.
> # Translate family delete markers to column delete markers at (major) 
> compaction time.
> # Change HFileWriterV* to keep track of the earliest put TS in a store and 
> write it to the file metadata. Use that to expire delete markers that are 
> older and hence can't affect any puts in the file.
> # Have Store.java keep track of the earliest put in internalFlushCache and 
> compactStore and then append it to the file metadata. That way HFileWriterV* 
> would not need to know about KVs.
> And I implemented #4.
> I'd love to get input on ideas.
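
The earliest-put-timestamp mechanism described in #3/#4 above can be sketched roughly as follows. This is a minimal illustration only, not the actual patch code: the class name, the metadata key, and the method names are hypothetical, and the real implementation lives in Store.java / the HFile writer.

```java
import java.util.HashMap;
import java.util.Map;

// Hedged sketch: track the earliest put timestamp seen while writing a file
// (flush or compaction), record it in the file metadata, and only collect
// delete markers strictly older than the earliest put in the store.
public class EarliestPutTracker {
    // Hypothetical metadata key; the real key name in HBase may differ.
    public static final String EARLIEST_PUT_TS = "EARLIEST_PUT_TS";

    private long earliestPutTs = Long.MAX_VALUE;

    /** Called for every put cell appended during flush or compaction. */
    public void trackPut(long timestamp) {
        if (timestamp < earliestPutTs) {
            earliestPutTs = timestamp;
        }
    }

    /** Appended to the HFile's metadata when the writer is closed. */
    public Map<String, Long> fileMetadata() {
        Map<String, Long> meta = new HashMap<>();
        meta.put(EARLIEST_PUT_TS, earliestPutTs);
        return meta;
    }

    /**
     * A delete marker can be collected only if it is older than every put
     * in the store; otherwise it might still mask a live cell.
     */
    public static boolean canCollectDeleteMarker(long markerTs, long earliestPutTsInStore) {
        return markerTs < earliestPutTsInStore;
    }
}
```

The pathological case from the description falls straight out of this rule: one ancient put (say TS 50) pins the store's earliest put TS at 50, so every delete marker with TS >= 50 — including the markers for the billion deleted rows — is never collectible, no matter how long ago the cells they masked expired.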



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
