+ Ravi, who will be looking into this.

From: Enrico Olivelli - Diennea <enrico.olive...@diennea.com>
Sent: Thursday, November 28, 2019 7:00 PM
To: user@bookkeeper.apache.org
Cc: f...@apache.org
Subject: Re: Bookeeper exception on pods restart


[EXTERNAL EMAIL]
From the error it looks like one client is trying to read an entry from the 
Bookie but the entry is not there.
I see two reasons:
1) The write never reached the bookie
2) The bookie is missing some file

For 1)
Do you have logs from the writer? Anything that could tell us that a write did 
not succeed?
How old is the entry supposed to be?
Do you have logs from the reader that is trying to read the entry?


For 2)
Do you have other errors in the logs about failed writes or anything similar?
If you were on 4.9 we could use the ‘local consistency checker’ to check for 
inconsistencies on the bookie; it scans the bookie looking for every entry that 
should reside on the bookie itself.
If you were writing your ledgers with a write quorum >= 2, you may be able to 
recover your data.


In order to debug the problem we should compare the logs of:

  *   The bookie
  *   The writer
  *   The reader


Enrico


On 28/11/19, 13:59, "prajakta.belgu...@dell.com" 
<prajakta.belgu...@dell.com> wrote:

I understand that auto recovery would replicate data for under replicated 
ledgers.
But it is scheduled to run only once in a while and may not have run before a 
reader tries to read this data from a certain bookie.

In general, what does the exception below indicate about the state of BK?
Does it indicate that the entry is missing on the specific bookie, so we 
cannot find it?
Or that something in the ledger metadata or the ledgers could have been 
corrupted?

Found the same issue with another product where you seem to have provided a 
custom fix:
https://github.com/diennea/herddb/issues/194

Overall, we want to understand whether this can be the result of a BK 
misconfiguration or is just a temporary unavailability problem that will 
resolve itself when auto-replication runs.

-Thanks,
Prajakta

From: Enrico Olivelli - Diennea <enrico.olive...@diennea.com>
Sent: Thursday, November 28, 2019 6:11 PM
To: user@bookkeeper.apache.org
Cc: f...@apache.org
Subject: Re: Bookeeper exception on pods restart


[EXTERNAL EMAIL]
I don’t think there is a good value.
You can use WriteQuorumSize = AckQuorumSize; this way you will see an error on 
the writing client if a write to any of the bookies fails.

Usually you enable the Autorecovery feature to fill in the gaps of 
under-replicated ledgers:
http://bookkeeper.apache.org/docs/4.10.0/admin/autorecovery/


Hope that helps
Enrico

On 28/11/19, 13:30, "prajakta.belgu...@dell.com" 
<prajakta.belgu...@dell.com> wrote:

What EnsembleSize, WriteQuorumSize and AckQuorumSize would you recommend, so we 
never see this?
What other ledger creation parameters do you need information about?

-Thanks,
Prajakta

From: Enrico Olivelli - Diennea <enrico.olive...@diennea.com>
Sent: Thursday, November 28, 2019 5:19 PM
To: user@bookkeeper.apache.org
Cc: f...@apache.org
Subject: Re: Bookeeper exception on pods restart


[EXTERNAL EMAIL]
Hi Prajakta,
What ledger creation parameters are you using? Ensemble size, write quorum 
size, ack quorum size?
If AckQuorumSize < WriteQuorumSize, it is possible that a write to one bookie 
failed, so the entry never reached that bookie even though it is supposed to 
be there, while the overall write still succeeded because an ack quorum of 
bookies acknowledged it.
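
To make the scenario above concrete, here is a minimal sketch in Python. It is 
a simplified model and not BookKeeper client code: LedgerConfig, write_set and 
simulate_write are hypothetical names, round-robin placement of entries over 
the ensemble is assumed, and the ensemble changes and retries that a real 
client performs are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LedgerConfig:
    ensemble_size: int  # number of bookies the ledger is striped over
    write_quorum: int   # number of bookies each entry is sent to
    ack_quorum: int     # acknowledgements needed before the write succeeds


def write_set(cfg, entry_id):
    """Bookie indices that should store entry_id (assumed round-robin striping)."""
    return [(entry_id + i) % cfg.ensemble_size for i in range(cfg.write_quorum)]


def simulate_write(cfg, entry_id, failed_bookies):
    """Return (acked, missing): whether the client sees success, and which
    bookies were supposed to hold the entry but never received it."""
    targets = write_set(cfg, entry_id)
    stored = [b for b in targets if b not in failed_bookies]
    missing = [b for b in targets if b in failed_bookies]
    acked = len(stored) >= cfg.ack_quorum
    return acked, missing


# Ack quorum smaller than write quorum: the write succeeds, yet bookie 2
# never stored entry 25, so a later read from that bookie cannot find it.
print(simulate_write(LedgerConfig(3, 3, 2), 25, failed_bookies={2}))

# Ack quorum equal to write quorum: the same failure surfaces on the writer.
print(simulate_write(LedgerConfig(3, 3, 3), 25, failed_bookies={2}))
```

In this model the first call reports a successful write with one bookie 
missing the entry, while the second call reports a failed write.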

Enrico

Il giorno 28/11/19, 12:44 
"prajakta.belgu...@dell.com<mailto:prajakta.belgu...@dell.com>" 
<prajakta.belgu...@dell.com<mailto:prajakta.belgu...@dell.com>> ha scritto:

Hello Team,

We have a question about an issue we are running into with BookKeeper.
We use BookKeeper version 4.7.3.

This issue occurs occasionally when BookKeeper servers are restarted.
We see the following error in the logs for some time, which blocks Pravega's 
operations for the same duration. We do not know the internals of BookKeeper, 
but based on the exception alone, it seems like BookKeeper might temporarily 
be unable to locate the files. What could be causing this?


2019-11-28 03:52:26,491 - ERROR - [BookieReadThreadPool-OrderedExecutor-0-0:ReadEntryProcessorV3@235] - IOException while reading entry: 25 from ledger 56
java.io.FileNotFoundException: No file for log 1 for 56 with location 4744138143
        at org.apache.bookkeeper.bookie.EntryLogger.findFile(EntryLogger.java:1165)
        at org.apache.bookkeeper.bookie.EntryLogger.getChannelForLogId(EntryLogger.java:1100)
        at org.apache.bookkeeper.bookie.EntryLogger.internalReadEntry(EntryLogger.java:1002)
        at org.apache.bookkeeper.bookie.EntryLogger.readEntry(EntryLogger.java:1051)
        at org.apache.bookkeeper.bookie.InterleavedLedgerStorage.getEntry(InterleavedLedgerStorage.java:305)
        at org.apache.bookkeeper.bookie.SortedLedgerStorage.getEntry(SortedLedgerStorage.java:153)
        at org.apache.bookkeeper.bookie.LedgerDescriptorImpl.readEntry(LedgerDescriptorImpl.java:153)
        at org.apache.bookkeeper.bookie.Bookie.readEntry(Bookie.java:1305)
        at org.apache.bookkeeper.proto.ReadEntryProcessorV3.readEntry(ReadEntryProcessorV3.java:175)
        at org.apache.bookkeeper.proto.ReadEntryProcessorV3.readEntry(ReadEntryProcessorV3.java:155)
        at org.apache.bookkeeper.proto.ReadEntryProcessorV3.getReadResponse(ReadEntryProcessorV3.java:218)
        at org.apache.bookkeeper.proto.ReadEntryProcessorV3.executeOp(ReadEntryProcessorV3.java:264)
        at org.apache.bookkeeper.proto.ReadEntryProcessorV3.safeRun(ReadEntryProcessorV3.java:260)
        at org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
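
As a reading aid for the "location" value in the exception above, the sketch 
below splits it into an entry log id and a byte offset. This rests on an 
assumption about how the 4.x EntryLogger packs locations (log id in the high 
32 bits, offset in the low 32 bits, files named with the hexadecimal log id 
plus ".log"); decode_location and entry_log_filename are hypothetical helper 
names, not BookKeeper APIs.

```python
def decode_location(location):
    """Split a packed entry location into (log_id, offset).

    Assumed layout: high 32 bits = entry log id, low 32 bits = byte offset.
    """
    log_id = location >> 32
    offset = location & 0xFFFFFFFF
    return log_id, offset


def entry_log_filename(log_id):
    """Assumed on-disk name of an entry log: hexadecimal id plus '.log'."""
    return format(log_id, "x") + ".log"


log_id, offset = decode_location(4744138143)
print(log_id, offset, entry_log_filename(log_id))
```

Under that assumption the location decodes to log id 1, which agrees with 
"No file for log 1" in the message, so the file to look for in the bookie's 
ledger directories would be 1.log.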

-Thanks,
Prajakta


________________________________

CONFIDENTIALITY & PRIVACY NOTICE
This e-mail (including any attachments) is strictly confidential and may also 
contain privileged information. If you are not the intended recipient you are 
not authorised to read, print, save, process or disclose this message. If you 
have received this message by mistake, please inform the sender immediately and 
destroy this e-mail, its attachments and any copies. Any use, distribution, 
reproduction or disclosure by any person other than the intended recipient is 
strictly prohibited and the person responsible may incur in penalties.
The use of this e-mail is only for professional purposes; there is no guarantee 
that the correspondence towards this e-mail will be read only by the recipient, 
because, under certain circumstances, there may be a need to access this email 
by third subjects belonging to the Company.
