Hi everyone,
My SNN failed two days ago and stopped triggering edit log rolls on the ANN,
so the edit log has grown to around 10 GB. After I restarted the SNN, it failed
to fetch the edit log because it is too large. The log is below:
015-09-22 00:23:07,338 ERROR org.apache.hadoop.hdfs.server.namenode.EditLogInputStream: Got error reading edit log input stream http://**********:8480/getJournal?jid=ns1&segmentTxId=19034359098&storageInfo=-56%3A200185119%3A1401352022932%3ACID-3c312573-1381-44f2-9e8b-fa2529f043d7&ugi=hadoop; failing over to edit log http://*******:8480/getJournal?jid=ns1&segmentTxId=19034359098&storageInfo=-56%3A200185119%3A1401352022932%3ACID-3c312573-1381-44f2-9e8b-fa2529f043d7&ugi=hadoop
java.net.SocketTimeoutException: Read timed out
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.read(SocketInputStream.java:129)
        at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
        at java.io.FilterInputStream.read(FilterInputStream.java:116)
        at sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.read(HttpURLConnection.java:2707)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
        at java.io.FilterInputStream.read(FilterInputStream.java:66)
        at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader$PositionTrackingInputStream.read(FSEditLogLoader.java:1105)
        at java.io.FilterInputStream.read(FilterInputStream.java:66)
        at java.util.zip.CheckedInputStream.read(CheckedInputStream.java:42)

I don't think the connection timeout set in URLFactory, which defaults to
1 minute, is a good idea.
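
A minimal sketch of what a caller-supplied timeout could look like, using only
java.net. The class name TimeoutSketch, the method open, and the example values
are hypothetical and are not part of Hadoop; this only illustrates passing the
timeouts in instead of fixing them at 1 minute.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.net.URLConnection;

// Hypothetical helper, not a Hadoop class.
final class TimeoutSketch {
    /**
     * Prepares a connection with caller-supplied timeouts (milliseconds).
     * openConnection() does not touch the network; the connection is only
     * established once the caller reads from it.
     */
    static URLConnection open(String url, int connectTimeoutMs, int readTimeoutMs) {
        try {
            URLConnection conn = URI.create(url).toURL().openConnection();
            conn.setConnectTimeout(connectTimeoutMs);
            // The read timeout bounds each blocking read, not the whole transfer.
            conn.setReadTimeout(readTimeoutMs);
            return conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For example, TimeoutSketch.open(journalUrl, 60_000, 600_000) would keep the
1 minute connect timeout but allow each read to block for up to 10 minutes.
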
For now I can't restart the SNN, so the ANN rolls its edit log only once per
day, and the resulting edit log is too large, which makes the SNN impossible to
restart.
I am currently developing a utility to work around this problem. The plan is:
1. Use RPC to ask the ANN to roll its edit log, as the EditLogTailer does.
2. Copy all the metadata from the SNN to the ANN, read the newest FSImage file
and the edit log files from the local file system, apply them to the
FSNamesystem, and then save the namespace to produce a new FSImage file.
3. Restart the SNN and hope everything goes well.
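
The core of step 2 (load an image, replay the newer edits, save a new image)
can be sketched with a toy model. This is not Hadoop code: CheckpointSketch,
Edit, and Image are hypothetical stand-ins for the edit log ops and FSImage,
and the namespace is reduced to a set of paths. It assumes the edits are
already sorted by transaction ID.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Toy model only; none of these types exist in Hadoop.
final class CheckpointSketch {
    /** One logged change: create or delete a path at a given transaction ID. */
    static final class Edit {
        final long txId;
        final String path;
        final boolean delete;
        Edit(long txId, String path, boolean delete) {
            this.txId = txId;
            this.path = path;
            this.delete = delete;
        }
    }

    /** The namespace as of lastTxId (stand-in for an image file). */
    static final class Image {
        final long lastTxId;
        final Set<String> paths;
        Image(long lastTxId, Set<String> paths) {
            this.lastTxId = lastTxId;
            this.paths = new TreeSet<>(paths); // defensive copy
        }
    }

    /** Replays edits newer than the image and returns a new image. */
    static Image checkpoint(Image image, List<Edit> edits) {
        Set<String> namespace = new TreeSet<>(image.paths);
        long last = image.lastTxId;
        for (Edit e : edits) {
            if (e.txId <= last) {
                continue; // already reflected in the image
            }
            if (e.txId != last + 1) {
                // A missing transaction would silently corrupt the result.
                throw new IllegalStateException(
                    "gap in edit log: expected txid " + (last + 1) + " but found " + e.txId);
            }
            if (e.delete) {
                namespace.remove(e.path);
            } else {
                namespace.add(e.path);
            }
            last = e.txId;
        }
        return new Image(last, namespace);
    }
}
```

The gap check is the point of the sketch: replaying local edit files is only
safe if every transaction between the image and the end of the log is present.
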


Any ideas? I would appreciate your reply. Thank you.
