[ 
https://issues.apache.org/jira/browse/HBASE-8034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13597499#comment-13597499
 ] 

Sergey Shelukhin commented on HBASE-8034:
-----------------------------------------

bq. OutputStream will always implement getPos?
We rely on it in a few places in HFileWriterV2, so I would say yes.


bq. You need to change this comment so that it says it's an estimate and say how 
you came by the estimate – in other words, this will be the definitive doc on 
this new metadata:
bq. Can you clarify what file versions are considered 'old files' ?
Done, on the method.

bq. Would it make more sense to expose the number of KeyValues in the HFile?
That is an interesting question. For the purposes of compaction we care more 
about the physical size being similar.
For the purposes of reads it's unclear, but the number of KeyValues probably 
matters more. That may be worth a separate improvement JIRA (including for the 
default compaction algorithm).

bq. This strikes me as flakey. Will there be another thread writing to the 
OutputStream when this method is invoked? Should it be synchronized?
Probably not. Do you mean background writing inside the object, or write calls?
We don't control the implementation for the former (it's the Hadoop one). For 
the latter, as in HFileWriterV2, we rely on calling this method only when we 
know we are not writing. That could be broken by future changes, but adding 
synchronization to file writing just for this seems like overkill.
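
To illustrate the calling convention described above, here is a minimal sketch. It is not the code in the patch: PositionedOutput is a hypothetical stand-in built on plain java.io that mimics a stream exposing getPos(), so the example is self-contained. The counter is deliberately unsynchronized, and the value is only reliable if the caller reads it between writes.

```java
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

/**
 * Hypothetical stand-in for an output stream that exposes its write position.
 * Not a Hadoop or HBase class; used here only to illustrate the pattern.
 */
class PositionedOutput extends FilterOutputStream {
    // Bytes written so far. Intentionally not synchronized: the caller is
    // expected to read it only when no write is in progress.
    private long pos = 0;

    PositionedOutput(OutputStream out) {
        super(out);
    }

    @Override
    public void write(int b) throws IOException {
        out.write(b);
        pos++;
    }

    @Override
    public void write(byte[] buf, int off, int len) throws IOException {
        out.write(buf, off, len);
        pos += len;
    }

    /** Returns the number of bytes written; call only between writes. */
    long getPos() {
        return pos;
    }
}
```

A single writer thread that appends a block and then calls getPos() sees a consistent value. A second thread writing concurrently would break that assumption, which is the trade-off discussed above.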
                
> record on-disk data size for store file and make it available during writing
> ----------------------------------------------------------------------------
>
>                 Key: HBASE-8034
>                 URL: https://issues.apache.org/jira/browse/HBASE-8034
>             Project: HBase
>          Issue Type: Task
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>            Priority: Minor
>         Attachments: HBASE-8034-v0.patch
>
>
> To better estimate the size of data in the file, and to be able to split 
> files intelligently during any multi-file compactor like stripe or level.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
