[ 
https://issues.apache.org/jira/browse/HDFS-16645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17735175#comment-17735175
 ] 

Prateek Agarwal commented on HDFS-16645:
----------------------------------------

[~weichiu] [~xuzq_zander] We are also hitting the same errors on JNs: after a 
JN restart, older in_progress files are left behind. In the logs, we do see the 
JN catching the SIGTERM signal, so it does not look like a non-graceful 
restart.
{code}
2023-06-20 04:43:30,743 ERROR org.apache.hadoop.hdfs.qjournal.server.JournalNode: RECEIVED SIGNAL 15: SIGTERM
{code}
Can we just ignore these WARN messages, given that the JN lag does eventually 
go down after the restart?

> Multi inProgress segments caused "Invalid log manifest"
> -------------------------------------------------------
>
>                 Key: HDFS-16645
>                 URL: https://issues.apache.org/jira/browse/HDFS-16645
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: ZanderXu
>            Assignee: ZanderXu
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> A JournalNode will have a residual in-progress segment if it is shut down 
> abnormally. After this JournalNode restarts and the Active NameNode tries to 
> open a new in-progress segment, the JournalNode will contain two in-progress 
> segments: the latest segment and the residual one.
> At this moment, the NameNode gets an IllegalStateException when trying to 
> getEditLogManifest from this JournalNode, with the exception as below:
> {code:java}
> java.lang.IllegalStateException: Invalid log manifest (log [1-? (in-progress)] overlaps [6-? (in-progress)])[[6-? (in-progress)], [1-? (in-progress)]] CommittedTxId: 0
>         at org.apache.hadoop.hdfs.server.protocol.RemoteEditLogManifest.checkState(RemoteEditLogManifest.java:62)
>         at org.apache.hadoop.hdfs.server.protocol.RemoteEditLogManifest.<init>(RemoteEditLogManifest.java:46)
>         at org.apache.hadoop.hdfs.qjournal.server.Journal.getEditLogManifest(Journal.java:740)
> {code}
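
For readers following the thread, the overlap reported in the stack trace above can be illustrated with a minimal Java sketch. This is not the Hadoop implementation: ManifestOverlapSketch, Segment, and checkNoOverlap are hypothetical names, and the sketch assumes that an in-progress segment is treated as open-ended (its end txid is unknown), so any segment that starts after it is considered overlapping.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative model only; none of these types are Hadoop classes.
public class ManifestOverlapSketch {

    /** Hypothetical stand-in for one edit log segment. */
    static final class Segment {
        final long startTxId;
        final long endTxId;       // Long.MAX_VALUE models "end not yet known"
        final boolean inProgress;

        private Segment(long startTxId, long endTxId, boolean inProgress) {
            this.startTxId = startTxId;
            this.endTxId = endTxId;
            this.inProgress = inProgress;
        }

        static Segment finalized(long start, long end) {
            return new Segment(start, end, false);
        }

        static Segment inProgressFrom(long start) {
            return new Segment(start, Long.MAX_VALUE, true);
        }

        @Override
        public String toString() {
            return "[" + startTxId + "-" + (inProgress ? "? (in-progress)" : endTxId) + "]";
        }
    }

    /** Throws if any two segments cover the same txid (assumed rule, see lead-in). */
    static void checkNoOverlap(List<Segment> segments) {
        List<Segment> sorted = new ArrayList<>(segments);
        sorted.sort(Comparator.comparingLong((Segment s) -> s.startTxId));
        Segment prev = null;
        for (Segment cur : sorted) {
            if (prev != null && cur.startTxId <= prev.endTxId) {
                throw new IllegalStateException(
                    "Invalid log manifest (log " + cur + " overlaps " + prev + ")");
            }
            prev = cur;
        }
    }

    public static void main(String[] args) {
        // Healthy manifest: one finalized segment, then the current in-progress one.
        checkNoOverlap(List.of(Segment.finalized(1, 5), Segment.inProgressFrom(6)));

        // Residual in-progress segment at txid 1 plus a new one opened at txid 6.
        try {
            checkNoOverlap(List.of(Segment.inProgressFrom(6), Segment.inProgressFrom(1)));
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Under these assumptions, a manifest with a finalized segment followed by a single in-progress segment passes, while two in-progress segments are rejected because the older one has no known end. The wording and ordering of the real Hadoop message may differ from what this sketch prints.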



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
