Thanks a lot for taking care of it. On Fri, Feb 6, 2015 at 3:03 PM, Sijie Guo (JIRA) <[email protected]> wrote:
> > [ > https://issues.apache.org/jira/browse/BOOKKEEPER-821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308731#comment-14308731 > ] > > Sijie Guo commented on BOOKKEEPER-821: > -------------------------------------- > > [~fpj] what is your take for this jira? If no objection, I would go ahead > to commit this. > > > Failing to write lastId to ledger directories should not fail startup of > bookies > > > -------------------------------------------------------------------------------- > > > > Key: BOOKKEEPER-821 > > URL: > https://issues.apache.org/jira/browse/BOOKKEEPER-821 > > Project: Bookkeeper > > Issue Type: Bug > > Components: bookkeeper-server > > Affects Versions: 4.3.0 > > Reporter: Jia Zhai > > Assignee: Jia Zhai > > Fix For: 4.3.1 > > > > Attachments: BOOKKEEPER-821_no-prefix.patch > > > > Original Estimate: 5h > > Remaining Estimate: 5h > > > > At the startup of bookie, the bookie constructor would call > setLastLogId() as this trace: > > Bookie() -> InterleavedLedgerStorage() -> EntryLogger() -> > Initialize() - >createNewLog() -> allocateNewLog() -> setLastLogId(). > > If "bw.write() and bw.flush()" in setLastLogId() failed, an IOException > would throw and not catch, and cause Bookie constructor fail. > > {code} > > private void setLastLogId(File dir, long logId) throws > IOException { > > FileOutputStream fos; > > fos = new FileOutputStream(new File(dir, "lastId")); > > BufferedWriter bw = new BufferedWriter(new > OutputStreamWriter(fos, UTF_8)); > > try { // < ==== original: if write fail in this try, > IOException thrown but not catch, and cause bookie startup fail. > > bw.write(Long.toHexString(logId) + "\n"); > > bw.flush(); > > } finally { > > try { > > bw.close(); > > } catch (IOException e) { > > LOG.error("Could not close lastId file in {}", > dir.getPath()); > > } > > } > > } > > {code} > > But failing setLastLogId() could be tolerated, and will not cause any > problem in next time Bookie startup. > > Next time, when calling EntryLogger constructor again, in > getLastLogId(), if read ledgerDir.lastId fail, it will walk through all log > files to get LastLogID; If reading an old ledgerDir.lastId, which caused by > last failure of setLastLogId(), and it happened to be the largest logId, > then allocateNewLog() will find the file already exist, and will allocate > newlogfile with bigger ID. > > {code} > > BufferedLogChannel allocateNewLog() throws IOException { > > List<File> list = ledgerDirsManager.getWritableLedgerDirs(); > > Collections.shuffle(list); > > // It would better not to overwrite existing entry log files > > File newLogFile = null; > > do { > > String logFileName = > Long.toHexString(++preallocatedLogId) + ".log"; > > for (File dir : list) { > > newLogFile = new File(dir, logFileName); > > currentDir = dir; > > if (newLogFile.exists()) { < === this will handle > last set fail issue, in which LastId update fail, and get a wrong > preallocatedLogId. > > LOG.warn("Found existed entry log " + newLogFile > > + " when trying to create it as a new > log."); > > newLogFile = null; > > break; > > } > > } > > } while (newLogFile == null); > > FileChannel channel = new RandomAccessFile(newLogFile, > "rw").getChannel(); > > BufferedLogChannel logChannel = new > BufferedLogChannel(channel, > > conf.getWriteBufferBytes(), > conf.getReadBufferBytes(), preallocatedLogId); > > logChannel.write((ByteBuffer) LOGFILE_HEADER.clear()); > > for (File f : list) { > > setLastLogId(f, preallocatedLogId); > > } > > LOG.info("Preallocated entry logger {}.", preallocatedLogId); > > return logChannel; > > } > > {code} > > > > -- > This message was sent by Atlassian JIRA > (v6.3.4#6332) >
