[ https://issues.apache.org/jira/browse/ASTERIXDB-3557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17922378#comment-17922378 ]
ASF subversion and git services commented on ASTERIXDB-3557:
------------------------------------------------------------
Commit 25404c12e95eec19b8a592d95e2eef6a7299bc2f in asterixdb's branch
refs/heads/master from Peeyush Gupta
[ https://gitbox.apache.org/repos/asf?p=asterixdb.git;h=25404c12e9 ]
[ASTERIXDB-3557][TX] ignore and log failure to read atomic txn log files
- user model changes: no
- storage format changes: no
- interface changes: no
Ext-ref: MB-65039
Change-Id: I5a2e3849cb6fe4a78e0499fb2591f7e734908d04
Reviewed-on: https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/19372
Integration-Tests: Jenkins <[email protected]>
Tested-by: Jenkins <[email protected]>
Reviewed-by: Murtadha Hubail <[email protected]>
> Failure in reading atomic txn log file results in crash loop
> ------------------------------------------------------------
>
> Key: ASTERIXDB-3557
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-3557
> Project: Apache AsterixDB
> Issue Type: Bug
> Components: TX - Transactions
> Reporter: Peeyush Gupta
> Priority: Major
>
> When an atomic transaction log file fails to deserialize during recovery,
> the CC enters a crash loop. In those cases, we need to delete the invalid
> files and continue processing.
> Sample failures:
>
> {noformat}
> 2025-01-28T11:18:30.840+00:00 ERRO CBAS.replication.NcLifecycleCoordinator [Executor-13:ClusterController] Node b420f4d7c136b5e56bda9374743cde5a failed to complete startup
> org.apache.asterix.common.exceptions.ACIDException: java.lang.NullPointerException: Cannot read the array length because "bytes" is null
>     at org.apache.asterix.transaction.management.service.transaction.TransactionManager.rollbackMetadataTransactionsWithoutWAL(TransactionManager.java:225) ~[asterix-transactions-1.0.3-2467.jar:1.0.3-2467]
>     at org.apache.asterix.app.nc.task.LocalStorageCleanupTask.perform(LocalStorageCleanupTask.java:51) ~[asterix-app-1.0.3-2467.jar:1.0.3-2467]
>     at org.apache.asterix.app.replication.message.RegistrationTasksResponseMessage.handle(RegistrationTasksResponseMessage.java:63) ~[asterix-app-1.0.3-2467.jar:1.0.3-2467]
>     at org.apache.asterix.messaging.NCMessageBroker.lambda$receivedMessage$0(NCMessageBroker.java:108) ~[asterix-app-1.0.3-2467.jar:1.0.3-2467]
>     at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) ~[?:?]
>     at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
>     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
>     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
>     at java.base/java.lang.Thread.run(Thread.java:840) [?:?]
> Caused by: java.lang.NullPointerException
>     at java.base/java.lang.String.<init>(String.java:1437) ~[?:?]
>     at org.apache.asterix.transaction.management.service.transaction.TransactionManager.rollbackMetadataTransactionsWithoutWAL(TransactionManager.java:215) ~[asterix-transactions-1.0.3-2467.jar:1.0.3-2467]
>     ... 8 more
> {noformat}
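For readers who want a concrete picture of the "ignore and log" behaviour named in the commit subject, here is a minimal Java sketch. It is not the code from the commit. The class AtomicTxnLogRecoverySketch and the methods readLogFile and recover are hypothetical names, and the rule that an empty file counts as corrupt is an assumption made only for illustration. The one detail taken from the issue is that the sample failure is a NullPointerException, so the sketch catches runtime exceptions as well as I/O errors.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Illustrative sketch only: class and method names are hypothetical, not AsterixDB APIs.
class AtomicTxnLogRecoverySketch {
    private static final Logger LOGGER =
            Logger.getLogger(AtomicTxnLogRecoverySketch.class.getName());

    // Stand-in for the real deserializer (assumption): an empty file counts as corrupt.
    static String readLogFile(Path file) throws IOException {
        byte[] bytes = Files.readAllBytes(file);
        if (bytes.length == 0) {
            throw new IOException("empty atomic txn log file: " + file);
        }
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Reads every regular file in logDir. A file that cannot be read is logged,
    // deleted, and skipped, so one bad file does not abort the whole pass.
    static List<String> recover(Path logDir) throws IOException {
        List<Path> files;
        try (Stream<Path> listing = Files.list(logDir)) {
            files = listing.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
        }
        List<String> recovered = new ArrayList<>();
        for (Path file : files) {
            try {
                recovered.add(readLogFile(file));
            } catch (IOException | RuntimeException e) {
                // RuntimeException is caught too: the sample trace fails with a
                // NullPointerException rather than an I/O error.
                LOGGER.log(Level.WARNING, "ignoring unreadable atomic txn log file " + file, e);
                Files.deleteIfExists(file);
            }
        }
        return recovered;
    }
}
```

Under these assumptions, calling recover on a directory that holds one valid file and one empty file returns the single valid payload and leaves only the valid file on disk. A failure to delete the bad file still propagates as an IOException in this sketch; whether the real fix tolerates that case is not stated in the issue.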
--
This message was sent by Atlassian Jira
(v8.20.10#820010)