Even if the files aren't closed properly, the fact that you are appending
should mean the data persists.

Are you using a version of Hadoop that supports sync?
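
If it does, here's a minimal sketch of what sync buys you (my own
illustration, not code from this thread; the path is made up, and the
method is hflush() on newer Hadoop, sync() on the 0.20-append line):

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FSDataOutputStream;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;

  public class AppendFlushSketch {
    public static void main(String[] args) throws Exception {
      FileSystem fs = FileSystem.get(new Configuration());
      FSDataOutputStream out = fs.create(new Path("/tmp/append-flush-demo"));
      out.write("one edit".getBytes("UTF-8"));
      // Push the buffered bytes to the datanode pipeline; once this
      // returns, the data survives a writer crash even though close()
      // never runs.
      out.hflush();
    }
  }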

Do you have logs that show the issue, i.e. hlogs that were moved but
whose edits weren't written?

Thx,

J-D

On Tue, Oct 18, 2011 at 7:40 AM, Mingjian Deng <[email protected]> wrote:

> Hi:
>    We hit a case that caused data loss in our cluster. splitLog blocked
> because of an error in our HDFS, so we killed the master. Some hlog files
> were moved from .logs to .oldlogs before they were written to
> .recovered.edits, so the regionservers couldn't replay those files.
>    In HLogSplitter.java, we found:
>    ...
>    archiveLogs(srcDir, corruptedLogs, processedLogs, oldLogDir, fs, conf);
>    } finally {
>      LOG.info("Finishing writing output logs and closing down.");
>      splits = outputSink.finishWritingAndClose();
>    }
>    Why is archiveLogs called before outputSink.finishWritingAndClose()?
> If the write threads fail but archiveLogs succeeds, won't these hlog
> files be moved to .oldlogs and become impossible to split on the next
> startup? (See the sketch after this quote.)
>
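
For what it's worth, here is a self-contained sketch (hypothetical helper
names, not the actual HLogSplitter code) of the ordering the question is
asking for: archive the source hlogs only after the output sink has
flushed and closed every recovered-edits writer, so a write failure can
never leave an already-archived log unsplit:

  import java.io.IOException;
  import java.util.List;

  public class SplitOrderSketch {
    // Hypothetical reordering of the quoted HLogSplitter fragment: archive
    // only after every recovered-edits writer has been flushed and closed.
    void splitAndArchive(List<String> hlogs) throws IOException {
      try {
        for (String hlog : hlogs) {
          writeRecoveredEdits(hlog); // hypothetical: write .recovered.edits
        }
      } finally {
        // Flush and close all output writers before anything is archived.
        finishWritingAndClose();
      }
      // Only reached if no write failed; otherwise the hlogs stay in
      // .logs and are re-split on the next startup.
      archiveLogs(hlogs);            // hypothetical: mv .logs -> .oldlogs
    }

    void writeRecoveredEdits(String hlog) throws IOException {}
    void finishWritingAndClose() throws IOException {}
    void archiveLogs(List<String> hlogs) throws IOException {}
  }

If a write thread fails, the exception propagates before archiveLogs is
reached, and the hlogs remain in .logs for the next split attempt.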
