Adar Dembo created KUDU-1943:
--------------------------------

             Summary: log containers should be reusable without first closing 
in-flight writable blocks
                 Key: KUDU-1943
                 URL: https://issues.apache.org/jira/browse/KUDU-1943
             Project: Kudu
          Issue Type: Bug
          Components: fs
    Affects Versions: 1.3.0
            Reporter: Adar Dembo


The log block manager has a longstanding issue wherein a container can only 
be reused for a new block once its outstanding writable block has been closed. 
The problem is that we like to delay the close (and sync) of blocks until the very end of a 
Kudu flush/compact operation, so as to maximize the amount of time that the 
kernel has to asynchronously flush dirty pages out to disk. As a result, the 
LBM can easily generate about a thousand containers when flushing a very modest 
tablet of ~30 columns. To be precise, the number of containers equals the flush 
threshold (1 GB by default) divided by the rowset size (32 MB by default), 
multiplied by the number of columns in the tablet: 32 x 30 = 960. Coupled with 
the LBM's default preallocation buffer size (32 MB), a single tablet flush can 
send the tserver's space consumption skyrocketing to roughly 30 GB.
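
The arithmetic above can be checked with a short sketch. This is only an 
illustration, not Kudu code: the function name and parameter names are 
hypothetical, and the defaults are the values quoted in this issue (1 GB flush 
threshold, 32 MB rowset size, 32 MB preallocation per container).

```python
MB = 1024 * 1024
GB = 1024 * MB


def estimate_flush_footprint(num_columns,
                             flush_threshold_bytes=1 * GB,
                             rowset_size_bytes=32 * MB,
                             prealloc_bytes=32 * MB):
    """Estimate containers created and bytes preallocated by one flush.

    Assumes one container per column per rowset, because no container is
    reused until its writable block is closed at the end of the flush.
    """
    rowsets = flush_threshold_bytes // rowset_size_bytes
    containers = rowsets * num_columns
    return containers, containers * prealloc_bytes


containers, preallocated = estimate_flush_footprint(num_columns=30)
print(containers, preallocated / GB)
```

For a 30-column tablet this gives 960 containers and about 30 GB of 
preallocated space, in line with the rough figures quoted above.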

In and of itself this isn't fatal; the tserver will make use of this space over 
time. But it's a pretty bad first impression for a novice who is trying to 
calculate just how much disk space Kudu uses, and it means Kudu's disk space 
consumption is very "bursty" instead of linear.




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
