I mean that the error "ERROR: invalid ledger id 56" is raised because the command used the wrong ledger id formatter. I was suggesting that you rerun the command so that we can collect more information and debug.
- Sijie

On Tue, Dec 3, 2019 at 12:17 AM Sharda, Ravi <ravi.sha...@dell.com> wrote:

> Did you mean we should run this on a running environment to recover from
> the failure?
>
> "bin/bookkeeper shell ledger -ledgeridformat long -m [ledger-id]"
> ------------------------------
> *From:* Sijie Guo <guosi...@gmail.com>
> *Sent:* Tuesday, December 3, 2019 1:10 PM
> *To:* Sharda, Ravi <ravi.sha...@dell.com>
> *Cc:* Enrico Olivelli <eolive...@gmail.com>; user <user@bookkeeper.apache.org>; Flavio Junqueira <f...@apache.org>
> *Subject:* Re: Bookeeper exception on pods restart
>
> I think 4.7.2 is using UUID as the ledger id formatter by default (it was
> a mistake, and reverted in its subsequent releases).
>
> So you might have to run "bin/bookkeeper shell ledger -ledgeridformat long
> -m [ledger-id]".
>
> Can you run this command again?
>
> ---
>
> I saw this occurring for several ledgers in this environment.
>
> The IOException might be related to disk issues, although I don't have
> enough information to tell.
>
> Thanks,
> Sijie
>
> On Mon, Dec 2, 2019 at 7:24 AM Sharda, Ravi <ravi.sha...@dell.com> wrote:
>
> Enrico,
>
> I saw this occurring for several ledgers in this environment.
>
> -----------
>
> ERROR - [BookieReadThreadPool-OrderedExecutor-3-0:ReadEntryProcessorV3@235] - IOException while reading entry: 5 from ledger 1243
>
> [BookieReadThreadPool-OrderedExecutor-7-0:ReadEntryProcessorV3@235] - IOException while reading entry: 15 from ledger 1239
>
> ERROR - [BookieReadThreadPool-OrderedExecutor-0-0:ReadEntryProcessorV3@235] - IOException while reading entry: 102 from ledger 64
>
> ERROR - [BookieReadThreadPool-OrderedExecutor-0-0:ReadEntryProcessorV3@235] - IOException while reading entry: 7 from ledger 728
> ------------------------------
> *From:* Enrico Olivelli <eolive...@gmail.com>
> *Sent:* Monday, December 2, 2019 7:01 PM
> *To:* user <user@bookkeeper.apache.org>
> *Cc:* Sijie Guo <guosi...@gmail.com>; Flavio Junqueira <f...@apache.org>
> *Subject:* Re: Bookeeper exception on pods restart
>
> sh-4.2# ./bookkeeper shell ledger -m 56
> ERROR: invalid ledger id 56
> ledger: Dump ledger index entries into readable format.
> usage: ledger [-m] <ledger_id>
> -m,--meta Print meta information
>
> Does it work for other ledgers?
>
> Enrico
>
> On Mon, Dec 2, 2019 at 10:06 AM Sharda, Ravi <ravi.sha...@dell.com> wrote:
>
> Hello Sijie,
>
> Any luck with this? Please let us know what could be going wrong.
>
> Thanks & best regards,
> Ravi
> ------------------------------
> *From:* Sharda, Ravi <ravi.sha...@dell.com>
> *Sent:* Friday, November 29, 2019 3:28 PM
> *To:* Sijie Guo <guosi...@gmail.com>
> *Cc:* user <user@bookkeeper.apache.org>; Flavio Junqueira <f...@apache.org>
> *Subject:* Re: Bookeeper exception on pods restart
>
> Thanks. Here's the output of the command:
>
> sh-4.2# ./bookkeeper shell ledger -m 56
> ERROR: invalid ledger id 56
> ledger: Dump ledger index entries into readable format.
> usage: ledger [-m] <ledger_id>
> -m,--meta Print meta information
> ------------------------------
> *From:* Sijie Guo <guosi...@gmail.com>
> *Sent:* Friday, November 29, 2019 3:15 PM
> *To:* Sharda, Ravi <ravi.sha...@dell.com>
> *Cc:* user <user@bookkeeper.apache.org>; Flavio Junqueira <f...@apache.org>
> *Subject:* Re: Bookeeper exception on pods restart
>
> Sorry, my bad. The command for reading the ledger index should be
> "bookkeeper shell ledger".
>
> From the `ls` output, I didn't find entry log 1.log under the ledgers
> directory. So I guess the log file doesn't exist. If you can provide the
> output of `bookkeeper shell ledger`, we can take a look at the index file
> to understand more.
>
> On Fri, Nov 29, 2019 at 1:20 AM Sharda, Ravi <ravi.sha...@dell.com> wrote:
>
> For the following error,
>
> OrderedExecutor-0-0:ReadEntryProcessorV3@235] - IOException while reading entry: 25 from ledger 56
> java.io.FileNotFoundException: No file for log 1 for 56 with location 4744138143
>     at org.apache.bookkeeper.bookie.EntryLogger.findFile(EntryLogger.java:1165)
>     at org.apache.bookkeeper.bookie.EntryLogger.getChannelForLogId(EntryLogger.java:1100)
>     at org.apache.bookkeeper.bookie.EntryLogger.internalReadEntry(EntryLogger.java:1002)
>     at org.apache.bookkeeper.bookie.EntryLogger.readEntry(EntryLogger.java:1051)
>     at org.apache.bookkeeper.bookie.InterleavedLedgerStorage.getEntry(InterleavedLedgerStorage.java:305)
>     at org.apache.bookkeeper.bookie.SortedLedgerStorage.getEntry(SortedLedgerStorage.java:153)
>     at org.apache.bookkeeper.bookie.L
> -----------
> sh-4.2# ./bookkeeper shell readledger --ledgerid 56
>
> ERROR: invalid value for option ledgerid : 56
> Must specify a ledger id
>
> -----------
> I didn't know how to check that the log file exists. Attaching the output
> of "ls -R -L", instead.
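
For readers who, like Ravi, are unsure how to check that the log file exists, here is a minimal Python sketch. It rests on two assumptions about the 4.x bookie storage layout that are worth verifying against your version: the "location" in the exception packs the entry log id into the upper 32 bits and the byte offset into the lower 32 bits, and the entry log file is named with the hexadecimal log id plus ".log". The function names and the `/bk/ledgers` path are hypothetical; substitute the directories from your bookie's ledgerDirectories setting.

```python
from pathlib import Path


def decode_location(location):
    """Split a packed entry location into (entry_log_id, byte_offset)."""
    # Assumption: upper 32 bits hold the entry log id, lower 32 bits the offset.
    return location >> 32, location & 0xFFFFFFFF


def find_entry_log(ledger_dirs, log_id):
    """Return the first existing path of the entry log file, or None."""
    name = f"{log_id:x}.log"  # assumed naming: hexadecimal log id + ".log"
    for ledger_dir in ledger_dirs:
        # Look in the "current" subdirectory first, then in the directory itself.
        for candidate in (Path(ledger_dir) / "current" / name,
                          Path(ledger_dir) / name):
            if candidate.is_file():
                return candidate
    return None


if __name__ == "__main__":
    # Location taken from the FileNotFoundException in this thread.
    log_id, offset = decode_location(4744138143)
    print(f"entry log id={log_id}, offset={offset}")
    # "/bk/ledgers" is a hypothetical path, not taken from the thread.
    print(find_entry_log(["/bk/ledgers"], log_id) or "entry log file not found")
```

Under these assumptions the location 4744138143 decodes to entry log 1, which agrees with the "No file for log 1" text in the exception.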
>
> ------------------------------
> *From:* Sijie Guo <guosi...@gmail.com>
> *Sent:* Friday, November 29, 2019 2:31 PM
> *To:* Sharda, Ravi <ravi.sha...@dell.com>
> *Cc:* user <user@bookkeeper.apache.org>; Flavio Junqueira <f...@apache.org>
> *Subject:* Re: Bookeeper exception on pods restart
>
> If it is a permanent error,
>
> - check if the log file (indicated in the error message) exists or not.
> - use `bookkeeper shell readledger` to dump the index of the given ledger.
> - see if the index points to the right entry log file or not
>
> - Sijie
>
> On Fri, Nov 29, 2019 at 12:46 AM Sharda, Ravi <ravi.sha...@dell.com> wrote:
>
> The latest instance we have seen is a permanent error. The bookies haven't
> recovered in the environment (last 2 days). In some previous instances,
> developers had reported that the bookies had recovered, but it is also
> possible that the error was slightly different from what we are seeing now.
>
> Thanks & best regards,
> Ravi
> ------------------------------
> *From:* Sijie Guo <guosi...@gmail.com>
> *Sent:* Friday, November 29, 2019 2:10 PM
> *To:* user <user@bookkeeper.apache.org>
> *Cc:* Flavio Junqueira <f...@apache.org>; Sharda, Ravi <ravi.sha...@dell.com>
> *Subject:* Re: Bookeeper exception on pods restart
>
> Sorry for jumping into the discussion. But the error message indicates
> that the entry log file 1 is not found.
> It seems to me that the entry log file was removed but the entry index
> still points to the old location. Is this a transient error or a permanent
> error?
>
> - Sijie
>
> On Fri, Nov 29, 2019 at 12:11 AM <prajakta.belgu...@dell.com> wrote:
>
> + Ravi, who will be looking into this ….
>
> *From:* Enrico Olivelli - Diennea <enrico.olive...@diennea.com>
> *Sent:* Thursday, November 28, 2019 7:00 PM
> *To:* user@bookkeeper.apache.org
> *Cc:* f...@apache.org
> *Subject:* Re: Bookeeper exception on pods restart
>
> From the error it looks like one client is trying to read an entry from
> the bookie but the entry is not there.
>
> I see two possible reasons:
>
> 1) The write never reached the bookie
>
> 2) The bookie is missing some file
>
> For 1)
>
> Do you have logs on the writer? Something that could tell us that a write
> did not succeed?
>
> How old is the entry supposed to be?
>
> Do you have logs on the reader that is trying to read the entry?
>
> For 2)
>
> Do you have other errors in the logs about failed writes or anything
> similar?
>
> If you were on 4.9 we could use the ‘localconsistency checker’ to check
> for inconsistencies on the bookie; it scans the bookie looking for every
> entry that should reside on the bookie itself.
>
> If you were writing your ledgers with writequorum >= 2 maybe you can
> recover your data.
>
> In order to debug the problem we should compare the logs of:
>
> - The bookie
> - The writer
> - The reader
>
> Enrico
>
> On 28/11/19, 13:59, "prajakta.belgu...@dell.com" <prajakta.belgu...@dell.com> wrote:
>
> I understand that auto recovery would replicate data for under-replicated
> ledgers.
>
> But it is scheduled to run only once in a while and may not have run
> before a reader tries to read this data from a certain bookie.
>
> Generally, what does the exception below indicate about the state of BK?
>
> Does it indicate that the entry is missing on the specific bookie and so
> we don’t find it?
>
> Or that something in the ledger metadata or ledgers could have been
> corrupted?
>
> Found the same issue with another product where you seem to have provided
> a custom fix:
>
> https://github.com/diennea/herddb/issues/194
>
> All in all, I want to understand whether this can be the result of BK
> misconfiguration or is just a temporary unavailability problem that will
> resolve itself when auto-replication runs.
>
> -Thanks,
> Prajakta
>
> *From:* Enrico Olivelli - Diennea <enrico.olive...@diennea.com>
> *Sent:* Thursday, November 28, 2019 6:11 PM
> *To:* user@bookkeeper.apache.org
> *Cc:* f...@apache.org
> *Subject:* Re: Bookeeper exception on pods restart
>
> I don’t think there is a good value.
>
> You can use WriteQuorumSize = AckQuorumSize; this way you will see an
> error on the writing client in case of a write failure to any of the
> bookies.
>
> Usually you enable the Autorecovery feature to fill in the gaps of
> under-replicated ledgers:
>
> http://bookkeeper.apache.org/docs/4.10.0/admin/autorecovery/
>
> Hope that helps
>
> Enrico
>
> On 28/11/19, 13:30, "prajakta.belgu...@dell.com" <prajakta.belgu...@dell.com> wrote:
>
> What EnsembleSize, WriteQuorumSize and AckQuorumSize would you recommend,
> so we never see this?
>
> What other ledger creation parameters do you need information about?
>
> -Thanks,
> Prajakta
>
> *From:* Enrico Olivelli - Diennea <enrico.olive...@diennea.com>
> *Sent:* Thursday, November 28, 2019 5:19 PM
> *To:* user@bookkeeper.apache.org
> *Cc:* f...@apache.org
> *Subject:* Re: Bookeeper exception on pods restart
>
> Hi Prajakta,
>
> What ledger creation parameters are you using? Ensemble size, write
> quorum size, ack quorum size?
>
> If ackQuorumSize < WriteQuorumSize it is possible that a write to the
> bookie failed and, even if the entry is supposed to be on the bookie, it
> never reached it, but the overall single write succeeded because an
> ack quorum of bookies acknowledged the write.
>
> Enrico
>
> On 28/11/19, 12:44, "prajakta.belgu...@dell.com" <prajakta.belgu...@dell.com> wrote:
>
> Hello Team,
>
> We have a question about an issue we are running into with BookKeeper.
>
> We use BookKeeper version 4.7.3.
>
> This issue occurs occasionally when BookKeeper servers are restarted.
>
> We see the following error in the logs for some time, which blocks
> Pravega's operations for the same duration. Not knowing the internals of
> BookKeeper, but just based on the exception alone, it seems like BookKeeper
> might not be able to locate the files temporarily. What could be causing
> this?
>
> 2019-11-28 03:52:26,491 - ERROR - [BookieReadThreadPool-OrderedExecutor-0-0:ReadEntryProcessorV3@235] - IOException while reading entry: 25 from ledger 56
> java.io.FileNotFoundException: No file for log 1 for 56 with location 4744138143
>     at org.apache.bookkeeper.bookie.EntryLogger.findFile(EntryLogger.java:1165)
>     at org.apache.bookkeeper.bookie.EntryLogger.getChannelForLogId(EntryLogger.java:1100)
>     at org.apache.bookkeeper.bookie.EntryLogger.internalReadEntry(EntryLogger.java:1002)
>     at org.apache.bookkeeper.bookie.EntryLogger.readEntry(EntryLogger.java:1051)
>     at org.apache.bookkeeper.bookie.InterleavedLedgerStorage.getEntry(InterleavedLedgerStorage.java:305)
>     at org.apache.bookkeeper.bookie.SortedLedgerStorage.getEntry(SortedLedgerStorage.java:153)
>     at org.apache.bookkeeper.bookie.LedgerDescriptorImpl.readEntry(LedgerDescriptorImpl.java:153)
>     at org.apache.bookkeeper.bookie.Bookie.readEntry(Bookie.java:1305)
>     at org.apache.bookkeeper.proto.ReadEntryProcessorV3.readEntry(ReadEntryProcessorV3.java:175)
>     at org.apache.bookkeeper.proto.ReadEntryProcessorV3.readEntry(ReadEntryProcessorV3.java:155)
>     at org.apache.bookkeeper.proto.ReadEntryProcessorV3.getReadResponse(ReadEntryProcessorV3.java:218)
>     at org.apache.bookkeeper.proto.ReadEntryProcessorV3.executeOp(ReadEntryProcessorV3.java:264)
>     at org.apache.bookkeeper.proto.ReadEntryProcessorV3.safeRun(ReadEntryProcessorV3.java:260)
>     at org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>
> -Thanks,
> Prajakta
>
> ------------------------------
>
> CONFIDENTIALITY & PRIVACY NOTICE
> This e-mail (including any attachments) is strictly confidential and may
> also contain privileged information. If you are not the intended recipient
> you are not authorised to read, print, save, process or disclose this
> message. If you have received this message by mistake, please inform the
> sender immediately and destroy this e-mail, its attachments and any copies.
> Any use, distribution, reproduction or disclosure by any person other than
> the intended recipient is strictly prohibited and the person responsible
> may incur in penalties.
> The use of this e-mail is only for professional purposes; there is no
> guarantee that the correspondence towards this e-mail will be read only by
> the recipient, because, under certain circumstances, there may be a need to
> access this email by third subjects belonging to the Company.
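
Enrico's point in this thread about ackQuorumSize being smaller than WriteQuorumSize can be illustrated with a toy model in Python. This is only a sketch under simplifying assumptions: `write_entry` and the bookie names are hypothetical, entries are striped round-robin over the ensemble, and there is no ensemble change, retry, or auto-recovery, all of which a real BookKeeper client and cluster would add.

```python
def write_entry(entry_id, ensemble, write_quorum, ack_quorum, unreachable):
    """Simulate one entry write.

    Returns (succeeded, stored_on, missing_on).
    """
    # Stripe the entry across write_quorum consecutive bookies in the ensemble.
    start = entry_id % len(ensemble)
    targets = [ensemble[(start + i) % len(ensemble)] for i in range(write_quorum)]
    stored_on = [b for b in targets if b not in unreachable]
    missing_on = [b for b in targets if b in unreachable]
    # The writer sees success once ack_quorum bookies have acknowledged.
    return len(stored_on) >= ack_quorum, stored_on, missing_on


if __name__ == "__main__":
    bookies = ["bookie-0", "bookie-1", "bookie-2"]
    down = {"bookie-2"}
    # ackQuorumSize < writeQuorumSize: the write is reported as successful
    # even though one target bookie never stored the entry.
    print(write_entry(25, bookies, write_quorum=3, ack_quorum=2, unreachable=down))
    # ackQuorumSize == writeQuorumSize: the same failure is visible to the writer.
    print(write_entry(25, bookies, write_quorum=3, ack_quorum=3, unreachable=down))
```

In the first call the writer gets a success while one bookie lacks the entry, which is the gap auto-recovery is meant to fill. In the second call the writer is told about the failure, which is why Enrico suggests WriteQuorumSize = AckQuorumSize when you want write failures surfaced on the client.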