[ 
https://issues.apache.org/jira/browse/HBASE-1249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12683438#action_12683438
 ] 

Jonathan Gray commented on HBASE-1249:
--------------------------------------

Erik:
{quote}
DeleteFamily in MemCache, should probably have a different ts, to belong there, 
same thing with stuff in the storefiles. 
{quote}
Why should DeleteFamily have different timestamps?  In the Memcache, it is a 
DeleteFamily for ts <= 9.  This request can come in at any time, so it could be 
in the Memcache.  Remember, deletes in the current memcache/storefile relate 
only to older ones (smaller timestamps).
{quote}
For the GetColumns(RowA, Fam, [ColA, ColB], ts(20, 0)), I would think that that 
query would mean give me 1 version of ColA and ColB from the timeperiod 20 - 0.
{quote}
I meant all versions.  Will update and fix that.  You're right, default should 
probably always be 1.
{quote}
And for GetFamily(RowA, Fam, num=1), Couldn't you early out as soon as you 
encounter the DeleteFamily in MemCache?
{quote}
I don't believe so.  You encounter the DeleteFamily(ts <= 9), but the ID of the 
next StoreFile is 21, so you don't know anything about what's in there.  You 
are actually done at this point for this particular query, but there is no way 
to know that there are no versions of columns other than ColA and ColB in 
StoreFile1, or even StoreFile2 (id=11, so it could also contain undeleted 
columns for the resultset).  For example, a (RowA, Put, Fam, ColC, 19, M) in 
StoreFile1 would be part of the resultset if it existed.
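
To make the no-early-out argument concrete, here is a rough sketch in Python. 
It is not the proposed server code: the KeyValue tuple, the get_family helper, 
and the contents of StoreFile2 are made up for illustration.  Only the 
DeleteFamily(ts <= 9) in the Memcache and the hypothetical ColC put at ts 19 in 
StoreFile1 come from the discussion above.

```python
from typing import Dict, List, NamedTuple


class KeyValue(NamedTuple):
    # Hypothetical flat layout, loosely following (row, type, family, column, ts, value).
    row: str
    kind: str      # "Put" or "DeleteFamily" in this sketch
    family: str
    column: str    # empty for DeleteFamily
    ts: int
    value: str


def get_family(stores: List[List[KeyValue]], row: str, family: str) -> Dict[str, KeyValue]:
    """Return the newest undeleted Put per column across the given stores."""
    family_delete_ts = -1              # highest DeleteFamily stamp seen so far
    newest: Dict[str, KeyValue] = {}
    for store in stores:               # Memcache first, then StoreFiles
        for kv in store:
            if kv.row != row or kv.family != family:
                continue
            if kv.kind == "DeleteFamily":
                family_delete_ts = max(family_delete_ts, kv.ts)
            elif kv.kind == "Put":
                current = newest.get(kv.column)
                if current is None or kv.ts > current.ts:
                    newest[kv.column] = kv
    # A DeleteFamily only hides puts with ts <= its own stamp.
    return {c: kv for c, kv in newest.items() if kv.ts > family_delete_ts}


memcache = [KeyValue("RowA", "DeleteFamily", "Fam", "", 9, "")]
store_file_1 = [KeyValue("RowA", "Put", "Fam", "ColC", 19, "M")]   # the "what if" put
store_file_2 = [KeyValue("RowA", "Put", "Fam", "ColA", 8, "old")]  # assumed, hidden by the delete

stopped_early = get_family([memcache], "RowA", "Fam")
full_scan = get_family([memcache, store_file_1, store_file_2], "RowA", "Fam")
```

Stopping after the Memcache yields an empty result, while the full scan still 
returns the ColC put at ts 19, which is why seeing the DeleteFamily alone is 
not enough to stop.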

Ryan:
{quote}
(RowA, Delete, Fam, ColA, 6, X)
Is the 'X' a typo? This should be a */no value since deletes dont actually 
carry any value info - the info is in the key itself.
{quote}
Yes, that is a typo.  If you look at the latest Example pdf, the StoreFiles 
should be accurate.
{quote}
It sounds like we lose the ability to version deletes if the value is in 
memcache
{quote}
I'm not sure I totally understand what you're saying about the removal from 
memcache affecting that.  To expand on your example...

We do a put of KeyValue:  (rowA, fam, put, colA, ts=now=100, X)
Then we do a delete of KeyValue, either Delete or DeleteColumn:  
DeleteColumn(rowA, fam, deletecolumn, colA, ts<=#)
Delete(rowA, fam, delete, colA, ts=#)

The timestamp that the delete goes in with is how you version the delete.  
Delete is for the lowest unit: a single row, single family, single column, and 
single version.  DeleteColumn is for a range of versions of a column; my 
current thought is <= the specified stamp.  If you don't specify one, the stamp 
defaults to now, which would delete all versions.  Does that make more sense?  
I wasn't really clear about the 3 different delete types.
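
As a rough illustration of the three delete types, here is a small Python 
sketch of the matching rules as I read them from this thread.  The KeyValue 
tuple, make_delete, masks, and the explicit "now" argument are hypothetical 
names for the example, not part of any proposed API.

```python
from typing import NamedTuple, Optional


class KeyValue(NamedTuple):
    # Hypothetical layout following (row, fam, type, col, ts) from the example above.
    row: str
    family: str
    kind: str      # "put", "delete", "deletecolumn" or "deletefamily"
    column: str    # empty for deletefamily
    ts: int


def make_delete(kind: str, row: str, family: str, column: str = "",
                ts: Optional[int] = None, now: int = 0) -> KeyValue:
    # With no explicit stamp the delete goes in with "now" (passed in here
    # so the sketch stays deterministic).
    return KeyValue(row, family, kind, column, now if ts is None else ts)


def masks(delete: KeyValue, put: KeyValue) -> bool:
    """True if the delete hides the put under the rules described above."""
    if (delete.row, delete.family) != (put.row, put.family):
        return False
    if delete.kind == "deletefamily":
        return put.ts <= delete.ts       # every column, all versions <= stamp
    if delete.column != put.column:
        return False
    if delete.kind == "deletecolumn":
        return put.ts <= delete.ts       # one column, all versions <= stamp
    if delete.kind == "delete":
        return put.ts == delete.ts       # one column, exactly one version
    return False


put = KeyValue("rowA", "fam", "put", "colA", 100)
delete_all = make_delete("deletecolumn", "rowA", "fam", "colA", now=100)
delete_old = make_delete("deletecolumn", "rowA", "fam", "colA", ts=50)
delete_one = make_delete("delete", "rowA", "fam", "colA", ts=100)
delete_miss = make_delete("delete", "rowA", "fam", "colA", ts=99)
```

Here delete_all and delete_one hide the put at ts=100, while delete_old and 
delete_miss leave it visible.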

> Rearchitecting of server, client, API, key format, etc for 0.20
> ---------------------------------------------------------------
>
>                 Key: HBASE-1249
>                 URL: https://issues.apache.org/jira/browse/HBASE-1249
>             Project: Hadoop HBase
>          Issue Type: Improvement
>            Reporter: Jonathan Gray
>            Priority: Blocker
>             Fix For: 0.20.0
>
>         Attachments: HBASE-1249-Example-v1.pdf, HBASE-1249-GetQuery-v1.pdf, 
> HBASE-1249-GetQuery-v2.pdf, HBASE-1249-StoreFile-v1.pdf
>
>
> To discuss all the new and potential issues coming out of the change in key 
> format (HBASE-1234): zero-copy reads, client binary protocol, update of API 
> (HBASE-880), server optimizations, etc...

