[ 
https://issues.apache.org/jira/browse/HDFS-14646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Cao updated HDFS-14646:
------------------------------
    Description: 
*Problem Description:*
 In a multi-NameNode deployment, when a standby NameNode (SNN) uploads an 
FsImage, it puts the image to every other NameNode, whether or not that peer 
is the active NameNode (ANN). Even if the peer NN immediately replies with an 
error (such as TransferResult.NOT_ACTIVE_NAMENODE_FAILURE or 
TransferResult.OLD_TRANSACTION_ID_FAILURE), the local SNN does not terminate 
the put. It sends the complete FsImage to the peer NN and does not read the 
peer's reply until the put has finished.

Depending on the Jetty version, this behavior has different consequences. I 
tested it with Hadoop 2.7.2 and with trunk.

*1. In Hadoop 2.7.2 (with Jetty 6.1.26)*
 After the peer NN calls HttpServletResponse.sendError(), the underlying TCP 
connection stays open, and the data the SNN sends is read by the Jetty 
framework itself on the peer NN side. The SNN therefore keeps sending the 
whole FsImage to the peer NN for no benefit, wasting time and bandwidth. In a 
relatively large HDFS cluster the FsImage can often reach about 30 GB, so the 
waste is substantial.
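
To put a rough number on the waste: the transfer time is the image size in 
bits divided by the link bandwidth. The following is only a minimal sketch. 
The 1 Gbit/s link speed and the class and method names are assumptions made 
for illustration; only the image size of about 30 GB comes from the report.

```java
final class WastedTransferEstimate {
    /** Seconds needed to push the given number of bytes over a link of the given speed. */
    static double transferSeconds(long bytes, double bitsPerSecond) {
        return bytes * 8.0 / bitsPerSecond;
    }

    public static void main(String[] args) {
        long imageBytes = 30_000_000_000L; // about 30 GB (decimal), from the report
        double linkBitsPerSecond = 1e9;    // ASSUMPTION: a 1 Gbit/s link
        System.out.printf("%.0f seconds per rejected upload%n",
                transferSeconds(imageBytes, linkBitsPerSecond));
    }
}
```

Under these assumptions, every upload that the peer has already rejected 
still occupies the link for about four minutes.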

*2. In trunk version (with Jetty 9.3.27)*
 After the peer NN calls HttpServletResponse.sendError(), the underlying TCP 
connection is closed automatically, and the SNN then gets an "Error writing 
request body to server" exception, as shown below:
{code:java}
2019-08-17 03:59:25,413 INFO namenode.TransferFsImage: Sending fileName: /tmp/hadoop-root/dfs/name/current/fsimage_0000000000003364240, fileSize: 9864721. Sent total: 524288 bytes. Size of last segment intended to send: 4096 bytes.
java.io.IOException: Error writing request body to server
    at sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.checkError(HttpURLConnection.java:3587)
    at sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.write(HttpURLConnection.java:3570)
    at org.apache.hadoop.hdfs.server.namenode.TransferFsImage.copyFileToStream(TransferFsImage.java:396)
    at org.apache.hadoop.hdfs.server.namenode.TransferFsImage.writeFileToPutRequest(TransferFsImage.java:340)
    at org.apache.hadoop.hdfs.server.namenode.TransferFsImage.uploadImage(TransferFsImage.java:314)
    at org.apache.hadoop.hdfs.server.namenode.TransferFsImage.uploadImageFromStorage(TransferFsImage.java:249)
    at org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer$1.call(StandbyCheckpointer.java:277)
    at org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer$1.call(StandbyCheckpointer.java:272)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
2019-08-17 03:59:25,422 INFO namenode.TransferFsImage: Sending fileName: /tmp/hadoop-root/dfs/name/current/fsimage_0000000000003364240, fileSize: 9864721. Sent total: 851968 bytes. Size of last segment intended to send: 4096 bytes.
java.io.IOException: Error writing request body to server
    at sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.checkError(HttpURLConnection.java:3587)
    at sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.write(HttpURLConnection.java:3570)
    at org.apache.hadoop.hdfs.server.namenode.TransferFsImage.copyFileToStream(TransferFsImage.java:396)
    at org.apache.hadoop.hdfs.server.namenode.TransferFsImage.writeFileToPutRequest(TransferFsImage.java:340)
  {code}
                  

*Solution:*
 A standby NameNode should not upload an FsImage to an inappropriate NameNode. 
Before it puts an FsImage to a peer NN, it needs to check whether the put is 
actually needed at that time.
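
The check proposed above can be sketched roughly as follows. This is not the 
code of the attached patches and uses no Hadoop APIs: the class 
UploadTargetFilter, the PeerState enum, the Peer type and the host names are 
all hypothetical. It also assumes that the SNN can learn each peer's HA state 
and latest image transaction ID before it starts the put, and that a peer 
rejects an image that is not newer than its own.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class UploadTargetFilter {
    /** HYPOTHETICAL HA state of a peer as seen by the standby; not a Hadoop type. */
    enum PeerState { ACTIVE, STANDBY, UNKNOWN }

    /** HYPOTHETICAL snapshot of a peer NameNode. */
    static final class Peer {
        final String address;
        final PeerState state;
        final long latestImageTxId;

        Peer(String address, PeerState state, long latestImageTxId) {
            this.address = address;
            this.state = state;
            this.latestImageTxId = latestImageTxId;
        }
    }

    /** Returns true only if the put is expected to be accepted by the peer. */
    static boolean shouldUpload(Peer peer, long localImageTxId) {
        if (peer.state != PeerState.ACTIVE) {
            return false; // the peer would answer NOT_ACTIVE_NAMENODE_FAILURE
        }
        // ASSUMPTION: an image that is not newer than the peer's own is
        // rejected (the OLD_TRANSACTION_ID_FAILURE case).
        return localImageTxId > peer.latestImageTxId;
    }

    /** Addresses of the peers that should actually receive the image. */
    static List<String> selectTargets(List<Peer> peers, long localImageTxId) {
        List<String> targets = new ArrayList<>();
        for (Peer peer : peers) {
            if (shouldUpload(peer, localImageTxId)) {
                targets.add(peer.address);
            }
        }
        return targets;
    }

    public static void main(String[] args) {
        List<Peer> peers = Arrays.asList(
                new Peer("nn1.example:9870", PeerState.ACTIVE, 100L),
                new Peer("nn2.example:9870", PeerState.STANDBY, 100L));
        System.out.println(selectTargets(peers, 200L));
    }
}
```

Doing the check before opening the connection avoids both outcomes described 
above, because no request body is sent to a peer that would reject it.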


> Standby NameNode should not upload fsimage to an inappropriate NameNode.
> ------------------------------------------------------------------------
>
>                 Key: HDFS-14646
>                 URL: https://issues.apache.org/jira/browse/HDFS-14646
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs
>    Affects Versions: 3.1.2
>            Reporter: Xudong Cao
>            Assignee: Xudong Cao
>            Priority: Major
>         Attachments: HDFS-14646.000.patch, HDFS-14646.001.patch
>



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
