Juan Yu created KUDU-1768:
-----------------------------
Summary: Tablet server cannot recover after run out of disk space
Key: KUDU-1768
URL: https://issues.apache.org/jira/browse/KUDU-1768
Project: Kudu
Issue Type: Bug
Affects Versions: 1.0.0
Reporter: Juan Yu
Attachments: tserver.log.tar.gz
My node has a small disk. After upserting into a large table several times, it
ran out of disk space. I dropped the table. All nodes dropped the tablet
successfully except one, which is stuck in a bad state: its tablet data is not
removed even after a restart.
For the dropped table, the stuck node shows:
{code}
tmp_store_sales d88f8a7eba974cba8d9c547f69168c6b hash buckets: (0)
FAILED (TABLET_DATA_DELETED): Invalid argument: Unable to delete on-disk data
from tablet d88f8a7eba974cba8d9c547f69168c6b: The metadata for tablet
d88f8a7eba974cba8d9c547f69168c6b still references orphaned blocks. Call
DeleteTabletData() first Invalid argument: Unable to delete
on-disk data from tablet d88f8a7eba974cba8d9c547f69168c6b: The metadata for
tablet d88f8a7eba974cba8d9c547f69168c6b still references orphaned blocks. Call
DeleteTabletData() first
{code}
For the other tables, it shows:
{code}
kudu_store_sales f539ce77657e431aa5ce4ab7f7375f84 hash buckets:
(0, 7) FAILED (TABLET_DATA_READY): IO error: Failed log replay. Reason:
Failed to open new log: Insufficient disk space to allocate 67108864 bytes
under path
/dataroot/dataroot/tserver/wal/wals/f539ce77657e431aa5ce4ab7f7375f84/.tmp.newsegmentFdOSTp
(30613504 bytes free vs 0 bytes reserved) (error 28) IO
error: Failed log replay. Reason: Failed to open new log: Insufficient disk
space to allocate 67108864 bytes under path
/dataroot/dataroot/tserver/wal/wals/f539ce77657e431aa5ce4ab7f7375f84/.tmp.newsegmentFdOSTp
(30613504 bytes free vs 0 bytes reserved) (error 28)
{code}
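For reference, the numbers in the second error work out as follows: the WAL tried to preallocate 67108864 bytes (64 MiB) while only 30613504 bytes (about 29.2 MiB) were free, and error 28 is ENOSPC. Below is a minimal Python sketch of that arithmetic. It assumes the check is simply "free bytes minus reserved bytes must cover the request"; the function names are hypothetical and this is not Kudu's actual code.

```python
import errno
import shutil

# Values copied from the error message above.
REQUEST_BYTES = 67108864   # 64 MiB preallocation for the new WAL segment
FREE_BYTES = 30613504      # about 29.2 MiB free on the volume
RESERVED_BYTES = 0         # "0 bytes reserved"


def can_preallocate(free_bytes, request_bytes, reserved_bytes=0):
    """Hypothetical check: is there room for the request after the reservation?"""
    return free_bytes - reserved_bytes >= request_bytes


def shortfall(free_bytes, request_bytes, reserved_bytes=0):
    """Bytes that would have to be freed for the request to fit (0 if it fits)."""
    return max(0, request_bytes - (free_bytes - reserved_bytes))


def check_path(path, request_bytes, reserved_bytes=0):
    """Run the same check against a real directory; raise ENOSPC if it fails."""
    free = shutil.disk_usage(path).free
    if not can_preallocate(free, request_bytes, reserved_bytes):
        raise OSError(errno.ENOSPC, "insufficient disk space", path)


if __name__ == "__main__":
    print(can_preallocate(FREE_BYTES, REQUEST_BYTES, RESERVED_BYTES))
    print(shortfall(FREE_BYTES, REQUEST_BYTES, RESERVED_BYTES))
```

Under that assumption, roughly 34.8 MiB would have to be freed on this volume before a single new log segment could be opened, so log replay for each remaining tablet keeps failing as long as the dropped tablet's data stays on disk.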
I have attached the tablet server log from this node.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)