[ https://issues.apache.org/jira/browse/KUDU-678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15281697#comment-15281697 ]
Todd Lipcon commented on KUDU-678:
----------------------------------
Here's the little Python script I used for the above analysis:
{code}
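#!/usr/bin/env python
# Prints the id and length of every block that was created but never deleted,
# according to the file passed as the first argument (dumped via kudu-pbc-dump).
import re
import subprocess
import sys

META_RE = re.compile(r'id: (\d+).*op_type: (\S+)')
LENGTH_RE = re.compile(r'length: (\d+)')


def parse_metaline(line):
    """Return (op_type, block_id, length) for a record line, else None."""
    m = META_RE.search(line)
    if not m:
        return None
    l_m = LENGTH_RE.search(line)
    # Records without a length field are treated as zero-length.
    length = int(l_m.group(1)) if l_m else 0
    return (m.group(2), m.group(1), length)


def read_metadata(path):
    # universal_newlines=True makes stdout text on both Python 2 and 3.
    p = subprocess.Popen(["kudu-pbc-dump", "--oneline", path],
                         stdout=subprocess.PIPE, universal_newlines=True)
    stdout, _ = p.communicate()
    return [m for m in (parse_metaline(l) for l in stdout.split("\n")) if m]


# Replay the records in order; whatever is left in the dict is still live.
blocks = {}
for op, block_id, length in read_metadata(sys.argv[1]):
    if op == 'CREATE':
        blocks[block_id] = length
    elif op == 'DELETE':
        del blocks[block_id]

for block_id, length in blocks.items():
    print("%s %s" % (block_id, length))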
{code}
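For anyone who wants to try the replay logic without a real metadata file, here is a minimal sketch that runs the same two regexes over hand-written lines. The sample lines are hypothetical: their layout is only inferred from the regexes and is not actual kudu-pbc-dump output. The helper name live_blocks is also made up for this example, and unlike the script it ignores a DELETE for an unknown id instead of raising.

```python
import re

# Same patterns as the script above.
META_RE = re.compile(r'id: (\d+).*op_type: (\S+)')
LENGTH_RE = re.compile(r'length: (\d+)')


def live_blocks(lines):
    """Replay CREATE/DELETE records; return {block_id: length} of survivors."""
    blocks = {}
    for line in lines:
        m = META_RE.search(line)
        if not m:
            continue  # not a record line
        block_id, op = m.group(1), m.group(2)
        l_m = LENGTH_RE.search(line)
        if op == 'CREATE':
            # A missing length field is treated as zero.
            blocks[block_id] = int(l_m.group(1)) if l_m else 0
        elif op == 'DELETE':
            # Assumption: silently ignore a DELETE for an id never created.
            blocks.pop(block_id, None)
    return blocks


# Hypothetical input, shaped only to satisfy the regexes above.
SAMPLE = [
    'block_id { id: 1 } op_type: CREATE length: 0',
    'block_id { id: 2 } op_type: CREATE length: 4096',
    'block_id { id: 2 } op_type: DELETE',
]

for block_id, length in sorted(live_blocks(SAMPLE).items()):
    print("%s %s" % (block_id, length))
```

With this input, block 2 is created and then deleted, so only the zero-length block 1 is left, which is the kind of leftover this issue is about.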
> An empty delta block is orphaned when flushing or compacting with no edits
> --------------------------------------------------------------------------
>
> Key: KUDU-678
> URL: https://issues.apache.org/jira/browse/KUDU-678
> Project: Kudu
> Issue Type: Bug
> Components: tablet
> Affects Versions: Private Beta
> Reporter: Todd Lipcon
> Priority: Critical
>
> Currently when the DRS writer writes an empty undo file, it doesn't add it to
> the RowSetMetadata. But, the block itself is still closed and left around in
> the BlockManager, causing a tiny data leak. These files are going to be
> small, so it's not a big deal, but we ought to fix it.