[
https://issues.apache.org/jira/browse/KUDU-1768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15795631#comment-15795631
]
Juan Yu commented on KUDU-1768:
-------------------------------
BTW, the tablet server can also crash:
{code}
Log file created at: 2016/12/30 20:16:58
Running on machine: impala4135-2.vpc.cloudera.com
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
F1230 20:16:58.748709 30089 tablet_peer_mm_ops.cc:128] Check failed: _s.ok()
FlushMRS failed on e8b5861a24fb4ada9bf2e219b87b73ef: IO error: Failed to open
DiskRowSet for flush: Unable to Start() writer for column double_col181[double
NULLABLE]: Couldn't write header:
/dataroot/dataroot/kudu/data/data/b0096794642b40c1b0ff7587d5082b7c.data: No
space left on device (error 28)
{code}
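For reference, error 28 is ENOSPC ("No space left on device") on Linux. The log replay failure quoted in the issue description reports a request for 67108864 bytes against 30613504 bytes free and 0 bytes reserved. The following is a minimal Python sketch of that kind of pre-allocation check, not Kudu's actual code; the function names `has_space_for` and `can_allocate` and their parameters are hypothetical.

```python
import shutil


def has_space_for(requested_bytes, free_bytes, reserved_bytes=0):
    """Return True if the request fits in the free space left after the reservation."""
    return free_bytes - reserved_bytes >= requested_bytes


def can_allocate(path, requested_bytes, reserved_bytes=0):
    """Apply the same check to the filesystem that holds `path`."""
    free_bytes = shutil.disk_usage(path).free
    return has_space_for(requested_bytes, free_bytes, reserved_bytes)


# Numbers taken from the quoted log message: a 64 MiB segment does not fit.
print(has_space_for(67108864, 30613504, 0))  # False
```

With these numbers the check fails at startup as well, which is consistent with the tablet staying in a FAILED state after a restart until space is freed.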
> Tablet server cannot recover after run out of disk space
> --------------------------------------------------------
>
> Key: KUDU-1768
> URL: https://issues.apache.org/jira/browse/KUDU-1768
> Project: Kudu
> Issue Type: Bug
> Affects Versions: 1.0.0
> Reporter: Juan Yu
> Attachments: tserver.log.tar.gz
>
>
> My node has a small disk. After upserting into a large table several times, it
> ran out of disk space. I dropped the table. All nodes dropped the tablet
> successfully except one, which is stuck in a bad state; its tablet data is not
> removed even after a restart.
> For the dropped table, it shows:
> {code}
> tmp_store_sales d88f8a7eba974cba8d9c547f69168c6b hash buckets:
> (0) FAILED (TABLET_DATA_DELETED): Invalid argument: Unable to delete
> on-disk data from tablet d88f8a7eba974cba8d9c547f69168c6b: The metadata for
> tablet d88f8a7eba974cba8d9c547f69168c6b still references orphaned blocks.
> Call DeleteTabletData() first Invalid argument: Unable to
> delete on-disk data from tablet d88f8a7eba974cba8d9c547f69168c6b: The
> metadata for tablet d88f8a7eba974cba8d9c547f69168c6b still references
> orphaned blocks. Call DeleteTabletData() first
> {code}
> For other tables, it shows:
> {code}
> kudu_store_sales f539ce77657e431aa5ce4ab7f7375f84 hash buckets:
> (0, 7) FAILED (TABLET_DATA_READY): IO error: Failed log replay. Reason:
> Failed to open new log: Insufficient disk space to allocate 67108864 bytes
> under path
> /dataroot/dataroot/tserver/wal/wals/f539ce77657e431aa5ce4ab7f7375f84/.tmp.newsegmentFdOSTp
> (30613504 bytes free vs 0 bytes reserved) (error 28) IO
> error: Failed log replay. Reason: Failed to open new log: Insufficient disk
> space to allocate 67108864 bytes under path
> /dataroot/dataroot/tserver/wal/wals/f539ce77657e431aa5ce4ab7f7375f84/.tmp.newsegmentFdOSTp
> (30613504 bytes free vs 0 bytes reserved) (error 28)
> {code}
> I attached the tablet server log from this node.