Denovo1998 opened a new issue, #20414: URL: https://github.com/apache/pulsar/issues/20414
### Search before asking

- [X] I searched in the [issues](https://github.com/apache/pulsar/issues) and found nothing similar.

### Motivation

#17221 describes an environment in which multiple bookie replicas are corrupted or a ledger has been deleted. Losing a schema ledger means that new producers and consumers cannot be created and cannot work properly.

PR #18010 addresses this by enabling `autoSkipNonRecoverableData` and skipping the lost schema, which can leave the schema information incomplete.

In the existing code, schema corruption also causes the metadata to be deleted:

https://github.com/apache/pulsar/blob/a953027aad38c9f54e952133949280ec2f4c04e8/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/schema/SchemaRegistryServiceImpl.java#L564-L570

A schema that hits a non-recoverable error is deleted, but PR #18010 and #19882 have also marked `NoSuchLedgerExistsOnMetadataServerException` as a recoverable exception.

So we need a solution that does not just skip the schema with the missing ledger, but actually restores the broken schema ledger.

### Solution

Add a new method called `tryCompleteTheLostSchemaLedger`. When a schema ledger has been lost and a new consumer subscribes or a new producer is created, an error such as "Failed to open gotten" is raised. When that happens, call `tryCompleteTheLostSchemaLedger`.

```java
CompletableFuture<Long> tryCompleteTheLostSchemaLedger(String schemaId, SchemaVersion version, SchemaData schema);
```

This method attempts to create a new ledger, save the SchemaData to it, and then update the metadata with the new ledger ID. Connected producers and consumers will keep working even if the schema ledger is deleted.

To get the SchemaData, we need to store the SchemaData and SchemaVersion in the `org.apache.pulsar.broker.service.Producer` and `org.apache.pulsar.broker.service.Consumer` objects that are connected or subscribed to the topic on the broker side, and pass them in when calling `tryCompleteTheLostSchemaLedger`.

### Alternatives

1. In the broker, `org.apache.pulsar.broker.service.Producer` and `org.apache.pulsar.broker.service.Consumer` do not save SchemaData and SchemaVersion, and `tryCompleteTheLostSchemaLedger` is only called through the admin API.

### Anything else?

_No response_

### Are you willing to submit a PR?

- [X] I'm willing to submit a PR!
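
To make the flow proposed under Solution more concrete, here is a minimal sketch of what `tryCompleteTheLostSchemaLedger` could do. Everything in it is an assumption for illustration: the class `SchemaLedgerRecoverySketch`, the two in-memory maps standing in for BookKeeper and the metadata store, and the use of `long` and `byte[]` in place of `SchemaVersion` and `SchemaData`. It is not the Pulsar implementation and does not use Pulsar or BookKeeper APIs.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Illustrative only: hypothetical in-memory stand-ins, not Pulsar or BookKeeper classes. */
final class SchemaLedgerRecoverySketch {
    // Hypothetical stand-in for BookKeeper: ledgerId -> stored schema bytes.
    private final Map<Long, byte[]> ledgers = new ConcurrentHashMap<>();
    // Hypothetical stand-in for the metadata store: "schemaId@version" -> ledgerId.
    private final Map<String, Long> schemaLocator = new ConcurrentHashMap<>();
    private final AtomicLong nextLedgerId = new AtomicLong(1);

    private static String key(String schemaId, long version) {
        return schemaId + "@" + version;
    }

    /** Normal write path: store the schema and record which ledger holds it. */
    long putSchema(String schemaId, long version, byte[] schemaData) {
        long ledgerId = nextLedgerId.getAndIncrement();
        ledgers.put(ledgerId, schemaData.clone());
        schemaLocator.put(key(schemaId, version), ledgerId);
        return ledgerId;
    }

    /** Simulates the failure: the ledger is gone but the metadata still points at it. */
    void loseLedger(long ledgerId) {
        ledgers.remove(ledgerId);
    }

    /** Read path: fails when the metadata references a ledger that no longer exists. */
    CompletableFuture<byte[]> getSchema(String schemaId, long version) {
        Long ledgerId = schemaLocator.get(key(schemaId, version));
        byte[] data = ledgerId == null ? null : ledgers.get(ledgerId);
        if (data == null) {
            return CompletableFuture.failedFuture(
                    new IllegalStateException("schema ledger missing: " + ledgerId));
        }
        return CompletableFuture.completedFuture(data.clone());
    }

    /** Recreates a lost schema ledger from data supplied by a connected producer or consumer. */
    CompletableFuture<Long> tryCompleteTheLostSchemaLedger(
            String schemaId, long version, byte[] schemaData) {
        String k = key(schemaId, version);
        Long oldLedgerId = schemaLocator.get(k);
        if (oldLedgerId == null) {
            return CompletableFuture.failedFuture(
                    new IllegalStateException("no metadata entry for " + k));
        }
        if (ledgers.containsKey(oldLedgerId)) {
            // The ledger is still readable, so there is nothing to repair.
            return CompletableFuture.completedFuture(oldLedgerId);
        }
        // Step 1: create a new ledger and save the schema data in it.
        long newLedgerId = nextLedgerId.getAndIncrement();
        ledgers.put(newLedgerId, schemaData.clone());
        // Step 2: repoint the metadata, but only if it still references the lost ledger.
        if (!schemaLocator.replace(k, oldLedgerId, newLedgerId)) {
            // A concurrent repair won the race; discard our ledger and report theirs.
            ledgers.remove(newLedgerId);
            return CompletableFuture.completedFuture(schemaLocator.get(k));
        }
        return CompletableFuture.completedFuture(newLedgerId);
    }

    public static void main(String[] args) {
        SchemaLedgerRecoverySketch registry = new SchemaLedgerRecoverySketch();
        byte[] schema = "{\"type\":\"string\"}".getBytes(StandardCharsets.UTF_8);
        long lost = registry.putSchema("public/default/my-topic", 0L, schema);
        registry.loseLedger(lost);
        System.out.println("read fails after loss: "
                + registry.getSchema("public/default/my-topic", 0L).isCompletedExceptionally());
        long repaired = registry
                .tryCompleteTheLostSchemaLedger("public/default/my-topic", 0L, schema).join();
        System.out.println("repaired into a different ledger: " + (repaired != lost));
        System.out.println("schema readable again: " + new String(
                registry.getSchema("public/default/my-topic", 0L).join(), StandardCharsets.UTF_8));
    }
}
```

The compare-and-replace on the metadata entry is one possible way to keep two brokers, or two clients reconnecting at the same time, from both repointing the same schema version. A real implementation would need an equivalent conditional update against the metadata store.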
