[ 
https://issues.apache.org/jira/browse/HBASE-1304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12710181#action_12710181
 ] 

Erik Holstad commented on HBASE-1304:
-------------------------------------

@Ryan
The way I see it, the fact that deletes only apply to earlier files is not 
something that will speed up the early-out scenario in all cases. Where it 
will help is when you have queries that don't need to touch files but only get 
data from memcache, since you then don't need to process any deletes in 
memcache. The fact that deletes, in the new implementation, only apply to 
older files is more of a byproduct of the fact that deletes in memcache are 
immediately applied to the data in there.

Whether that is the right approach is a different story. The reason I think it 
makes sense is that deletes take up a lot of resources and time when 
processing data, so I would like them to be as efficient as possible. The best 
thing would be to apply them to the whole store as soon as they come in, but 
since that is not realistic we have to do something else.
So by deleting everything in memcache that is affected by the incoming delete 
we save time and space: there is less data to process and there are fewer 
flushes, which leads to fewer compactions of any kind.
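
To make the idea concrete, here is a rough sketch of applying a delete to the 
in-memory data as soon as it arrives. It is not the code from the patch: the 
class MemcacheDeleteSketch, its methods and the "row/column" string key are 
all made up for illustration, and it assumes a delete masks every version at 
or below its timestamp. The marker is still remembered so that it can mask 
data in older files.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical toy model, not HBase code.
final class MemcacheDeleteSketch {
    // cell ("row/column") -> versions sorted by timestamp
    private final Map<String, TreeMap<Long, String>> cells = new TreeMap<>();
    // newest delete timestamp per cell, kept to mask data in older files
    private final Map<String, Long> deleteMarkers = new TreeMap<>();

    public void put(String cell, long timestamp, String value) {
        cells.computeIfAbsent(cell, k -> new TreeMap<>()).put(timestamp, value);
    }

    // Applies the delete right away: drops every in-memory version with a
    // timestamp <= the delete timestamp and returns how many were dropped.
    public int delete(String cell, long timestamp) {
        int removed = 0;
        TreeMap<Long, String> versions = cells.get(cell);
        if (versions != null) {
            // headMap is a live view, so clearing it removes from 'versions'
            NavigableMap<Long, String> affected = versions.headMap(timestamp, true);
            removed = affected.size();
            affected.clear();
            if (versions.isEmpty()) {
                cells.remove(cell);
            }
        }
        deleteMarkers.merge(cell, timestamp, Long::max);
        return removed;
    }

    // Number of versions currently held in memory (what a flush would write).
    public int size() {
        int n = 0;
        for (TreeMap<Long, String> v : cells.values()) {
            n += v.size();
        }
        return n;
    }

    public Long deleteMarker(String cell) {
        return deleteMarkers.get(cell);
    }

    public static void main(String[] args) {
        MemcacheDeleteSketch m = new MemcacheDeleteSketch();
        m.put("row1/col", 1L, "a");
        m.put("row1/col", 5L, "b");
        System.out.println("removed: " + m.delete("row1/col", 2L));
        System.out.println("left in memory: " + m.size());
    }
}
```

With this shape a read that is served from memory alone never has to look at 
delete markers, and a flush writes less data.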

The above reasoning might not make sense in all cases, but for a majority I 
think it does.

When it comes to minor compactions, I am not sure if you are worried about 
them taking longer than before, when we "just" merged the results. If that is 
the case, most of the work in that merge is finding out which KeyValue should 
be next; actually deleting the entries affected by a delete wouldn't add that 
much overhead.
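
A rough sketch of why the extra cost is small, again not the code from the 
patch. MinorCompactionSketch, Entry and the sort order (key ascending, newest 
timestamp first, deletes before puts) are assumptions made for illustration. 
The heap does the expensive part, picking the next entry; dropping a masked 
entry is one comparison on top of that. The sketch keeps the delete markers in 
the output, on the assumption that files outside this compaction may still 
hold data they mask.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical toy model, not HBase code.
final class MinorCompactionSketch {
    static final class Entry {
        final String key;
        final long ts;
        final boolean isDelete;

        Entry(String key, long ts, boolean isDelete) {
            this.key = key;
            this.ts = ts;
            this.isDelete = isDelete;
        }
    }

    public static Entry put(String key, long ts) { return new Entry(key, ts, false); }

    public static Entry delete(String key, long ts) { return new Entry(key, ts, true); }

    // Assumed order: key ascending, newest timestamp first, deletes before puts.
    static int compare(Entry a, Entry b) {
        int c = a.key.compareTo(b.key);
        if (c != 0) return c;
        c = Long.compare(b.ts, a.ts);
        if (c != 0) return c;
        return Boolean.compare(b.isDelete, a.isDelete);
    }

    // Merges files that are each already sorted, dropping puts masked by a
    // delete seen earlier in the merge. Delete markers themselves are kept.
    public static List<Entry> merge(List<List<Entry>> files) {
        // each heap element is a cursor {fileIndex, position}
        PriorityQueue<int[]> heap = new PriorityQueue<>(
            (x, y) -> compare(files.get(x[0]).get(x[1]), files.get(y[0]).get(y[1])));
        for (int i = 0; i < files.size(); i++) {
            if (!files.get(i).isEmpty()) heap.add(new int[] {i, 0});
        }
        List<Entry> out = new ArrayList<>();
        String deleteKey = null;
        long deleteTs = Long.MIN_VALUE;
        while (!heap.isEmpty()) {
            int[] cur = heap.poll();                 // the costly step: find the next entry
            List<Entry> file = files.get(cur[0]);
            Entry e = file.get(cur[1]);
            if (cur[1] + 1 < file.size()) heap.add(new int[] {cur[0], cur[1] + 1});
            if (e.isDelete) {
                if (!e.key.equals(deleteKey) || e.ts > deleteTs) {
                    deleteKey = e.key;
                    deleteTs = e.ts;
                }
                out.add(e);
            } else if (e.key.equals(deleteKey) && e.ts <= deleteTs) {
                // masked by a delete: the cheap extra check, entry is dropped
            } else {
                out.add(e);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<Entry>> files = new ArrayList<>();
        files.add(List.of(delete("a", 5), put("b", 3)));
        files.add(List.of(put("a", 7), put("a", 4), put("b", 1)));
        for (Entry e : merge(files)) {
            System.out.println((e.isDelete ? "delete " : "put ") + e.key + "@" + e.ts);
        }
    }
}
```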

What are your concerns when it comes to removing deleted KeyValues in a minor 
compaction? They are going to be removed eventually anyway, and there is 
currently no way to undo a delete to get them back, so the way I see it they 
are just a burden for the system. What kind of behaviour would you like to 
see?

Regards, Erik

> New client server implementation of how gets and puts are handled. 
> -------------------------------------------------------------------
>
>                 Key: HBASE-1304
>                 URL: https://issues.apache.org/jira/browse/HBASE-1304
>             Project: Hadoop HBase
>          Issue Type: Improvement
>    Affects Versions: 0.20.0
>            Reporter: Erik Holstad
>            Assignee: Jonathan Gray
>            Priority: Blocker
>             Fix For: 0.20.0
>
>         Attachments: hbase-1304-v1.patch, HBASE-1304-v2.patch, 
> HBASE-1304-v3.patch, HBASE-1304-v4.patch, HBASE-1304-v5.patch, 
> HBASE-1304-v6.patch, HBASE-1304-v7.patch
>
>
> Creating an issue where the implementation of the new client and server will 
> go. Leaving HBASE-1249 as a discussion forum and will put code and patches 
> here.

