cshannon opened a new issue, #3798:
URL: https://github.com/apache/accumulo/issues/3798

   While working on more `FencedFileSKVIterator` tests as part of #3766, I 
noticed that what getFirstKey() and getLastKey() return for fenced files is 
not consistent with what they return for regular files. 
   
   Normally the first or last key returned for an RFile is the actual first or 
last key in the file, so it includes the entire key (row, cf, cq, etc). 
However, for fenced files, if the range that is set starts after the first key 
or ends before the last key, then only a row is returned. For example, if the 
start of the range set on the fenced reader is after the first key in the 
file, then getFirstKey() just returns the start of the range, which is only a 
row and not the actual key in the file.
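
   A minimal sketch of the inconsistency described above. It does not use the 
real Accumulo classes: `SimpleKey` and `fencedFirstKey` are hypothetical 
stand-ins, and the fence is modeled as a plain start row rather than a full 
Range, so treat it as an illustration only.

```java
import java.util.Objects;

public class FencedFirstKeyDemo {

  // Hypothetical, simplified stand-in for a key: row, column family, qualifier.
  static final class SimpleKey {
    final String row;
    final String cf;
    final String cq;

    SimpleKey(String row, String cf, String cq) {
      this.row = row;
      this.cf = cf;
      this.cq = cq;
    }

    @Override
    public boolean equals(Object o) {
      if (!(o instanceof SimpleKey)) {
        return false;
      }
      SimpleKey other = (SimpleKey) o;
      return row.equals(other.row) && cf.equals(other.cf) && cq.equals(other.cq);
    }

    @Override
    public int hashCode() {
      return Objects.hash(row, cf, cq);
    }

    @Override
    public String toString() {
      return "SimpleKey[row=" + row + ", cf=" + cf + ", cq=" + cq + "]";
    }
  }

  // Models the behavior described in this issue (an assumption, not the real
  // implementation): when the fence starts after the file's first key, a
  // row-only key built from the fence start is returned instead of a real key.
  // A null fenceStartRow means the fence has no lower bound.
  static SimpleKey fencedFirstKey(SimpleKey fileFirstKey, String fenceStartRow) {
    if (fenceStartRow != null && fenceStartRow.compareTo(fileFirstKey.row) > 0) {
      return new SimpleKey(fenceStartRow, "", "");
    }
    return fileFirstKey;
  }

  public static void main(String[] args) {
    SimpleKey first = new SimpleKey("row_03", "fam", "qual");
    // No lower fence: the full first key from the file comes back.
    System.out.println(fencedFirstKey(first, null));
    // Fence starts after the first key: only a row comes back.
    System.out.println(fencedFirstKey(first, "row_05"));
  }
}
```

   The two calls return keys of different shapes (a full key versus a row-only 
key), which is the inconsistency in question.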
   
   In order to return the actual start key for a fenced file we would need to 
seek to the first key in the range, but this would be a performance hit 
because normally the first/last keys are part of the index. It also seems to 
be unnecessary: looking at the code base, it appears that the only places 
these methods are actually used are places where we only care about the row 
and nothing else (really just splits).
   
   So we should make this behavior consistent. We could either create new 
methods called getFirstRow() and getLastRow() that just return the row part of 
the key, or we could rename getFirstKey() and getLastKey(), since they don't 
seem to be used anywhere that needs the full key. If we created new methods 
then we would likely need to update the fencing iterator to seek to find the 
real first and last key for the old methods, and then update the split code to 
use getFirstRow() and getLastRow() instead. If we go with the rename option we 
need to verify that we don't actually need the full key, because I only took a 
quick glance and could have missed something.
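
   A rough sketch of what row-only accessors could look like. Only the method 
names getFirstRow() and getLastRow() come from the proposal above; the 
`RowBounds` interface, the `FileBounds` and `FencedBounds` classes, and the use 
of String rows (instead of byte-based rows) are assumptions made for 
illustration.

```java
public class RowBoundsSketch {

  // Hypothetical interface exposing only rows, as proposed above.
  interface RowBounds {
    String getFirstRow();

    String getLastRow();
  }

  // Unfenced file: rows are taken from the first/last keys, which would
  // normally be available from the index without a seek.
  static final class FileBounds implements RowBounds {
    private final String firstRow;
    private final String lastRow;

    FileBounds(String firstRow, String lastRow) {
      this.firstRow = firstRow;
      this.lastRow = lastRow;
    }

    @Override
    public String getFirstRow() {
      return firstRow;
    }

    @Override
    public String getLastRow() {
      return lastRow;
    }
  }

  // Fenced view: clamps the file's rows to the fence without seeking.
  // A null fence row means that side of the fence is unbounded.
  static final class FencedBounds implements RowBounds {
    private final RowBounds source;
    private final String fenceStartRow;
    private final String fenceEndRow;

    FencedBounds(RowBounds source, String fenceStartRow, String fenceEndRow) {
      this.source = source;
      this.fenceStartRow = fenceStartRow;
      this.fenceEndRow = fenceEndRow;
    }

    @Override
    public String getFirstRow() {
      String fileRow = source.getFirstRow();
      // Use the later of the file's first row and the fence start.
      if (fenceStartRow != null && fenceStartRow.compareTo(fileRow) > 0) {
        return fenceStartRow;
      }
      return fileRow;
    }

    @Override
    public String getLastRow() {
      String fileRow = source.getLastRow();
      // Use the earlier of the file's last row and the fence end.
      if (fenceEndRow != null && fenceEndRow.compareTo(fileRow) < 0) {
        return fenceEndRow;
      }
      return fileRow;
    }
  }

  public static void main(String[] args) {
    RowBounds file = new FileBounds("row_03", "row_90");
    RowBounds fenced = new FencedBounds(file, "row_05", null);
    System.out.println(fenced.getFirstRow() + " .. " + fenced.getLastRow());
  }
}
```

   With this shape, fenced and unfenced callers both receive a row and nothing 
more, so code such as the split logic would see the same kind of value in both 
cases.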

