[jira] [Commented] (HDFS-12654) APPEND API call is different in HTTPFS and NameNode REST
[ https://issues.apache.org/jira/browse/HDFS-12654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16387350#comment-16387350 ]

SammiChen commented on HDFS-12654:
----------------------------------

Hi [~Nuke] and [~iwasakims], it seems this is not an issue after further investigation. Can it be closed?

> APPEND API call is different in HTTPFS and NameNode REST
> --------------------------------------------------------
>
>                 Key: HDFS-12654
>                 URL: https://issues.apache.org/jira/browse/HDFS-12654
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs, httpfs, namenode
>    Affects Versions: 2.6.0, 2.7.0, 2.8.0, 3.0.0-beta1
>            Reporter: Andras Czesznak
>            Priority: Major
>
> The APPEND REST API call behaves differently in the NameNode REST and the
> HTTPFS code. The NameNode version creates the target file that the new data is
> being appended to if it does not exist at the time the call is issued. The HTTPFS
> version assumes the target file exists when APPEND is called and can append
> only the new data; it does not create the target file if it doesn't exist.
> The two implementations should be standardized; preferably the HTTPFS version
> should be modified to execute an implicit CREATE if the target file does not
> exist.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12654) APPEND API call is different in HTTPFS and NameNode REST
[ https://issues.apache.org/jira/browse/HDFS-12654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344419#comment-16344419 ]

Masatake Iwasaki commented on HDFS-12654:
-----------------------------------------

Thanks for the info. WebHDFS does not create a new file when append is requested; it returns 404. NamenodeWebHdfsMethods#chooseDataNode::
{noformat}
    } else if (op == GetOpParam.Op.OPEN
        || op == GetOpParam.Op.GETFILECHECKSUM
        || op == PostOpParam.Op.APPEND) {
      //choose a datanode containing a replica
      final NamenodeProtocols np = getRPCServer(namenode);
      final HdfsFileStatus status = np.getFileInfo(path);
      if (status == null) {
        throw new FileNotFoundException("File " + path + " not found.");
      }
{noformat}
The non-existent file seems to be created by fluent-plugin-webhdfs. out_webhdfs.rb::
{noformat}
  def send_data(path, data)
    if @append
      begin
        @client.append(path, data)
      rescue WebHDFS::FileNotFoundError
        @client.create(path, data)
      end
{noformat}
The issue stated in the ticket is that WebHDFS returns 404 but HttpFs returns 500. I could not reproduce this.
{quote}
WebHDFS::ServerError means that the client (fluentd) receives HTTP response code 500 from the HttpFs server. The WebHDFS server returns 404 for such cases.
{quote}
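The append-then-create fallback quoted from out_webhdfs.rb above can be sketched in Java. This is a minimal illustration of the pattern only: the WebHdfsClient interface and InMemoryClient stub below are hypothetical stand-ins, not the real WebHDFS/HttpFs client API.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

public class AppendOrCreate {

  /** Hypothetical client abstraction standing in for a WebHDFS/HttpFs client. */
  interface WebHdfsClient {
    void append(String path, String data) throws IOException;
    void create(String path, String data) throws IOException;
  }

  /** Try APPEND first; on a missing file, fall back to an implicit CREATE. */
  static void sendData(WebHdfsClient client, String path, String data)
      throws IOException {
    try {
      client.append(path, data);   // WebHDFS maps its 404 to FileNotFoundException
    } catch (FileNotFoundException e) {
      client.create(path, data);   // the fallback fluent-plugin-webhdfs performs
    }
  }

  /** Minimal in-memory stand-in for the server side. */
  static class InMemoryClient implements WebHdfsClient {
    final Map<String, StringBuilder> files = new HashMap<>();

    public void append(String path, String data) throws IOException {
      StringBuilder f = files.get(path);
      if (f == null) {
        throw new FileNotFoundException("File " + path + " not found.");
      }
      f.append(data);
    }

    public void create(String path, String data) {
      files.put(path, new StringBuilder(data));
    }
  }

  public static void main(String[] args) throws IOException {
    InMemoryClient client = new InMemoryClient();
    sendData(client, "/tmp/README.txt", "line1\n"); // falls back to create
    sendData(client, "/tmp/README.txt", "line2\n"); // plain append
    System.out.print(client.files.get("/tmp/README.txt")); // prints "line1\nline2\n"
  }
}
```

Note that this client-side fallback is not atomic: another writer could create the file between the failed append and the create, which is one argument for standardizing the behavior server-side instead.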
[jira] [Commented] (HDFS-12654) APPEND API call is different in HTTPFS and NameNode REST
[ https://issues.apache.org/jira/browse/HDFS-12654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16336386#comment-16336386 ]

Andras Czesznak commented on HDFS-12654:
----------------------------------------

Hello [~iwasakims] San, the issue occurred with the FluentD app and is described here:
https://github.com/fluent/fluent-plugin-webhdfs/issues/46

From the HTTPFS log:
```
2017-10-03 16:20:59,204 WARN org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as: (auth:PROXY) via httpfs (auth:SIMPLE) cause:java.io.FileNotFoundException: failed to append to non-existent file /fluentd/process/.log for client
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFileInternal(FSNamesystem.java:2930)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFileInt(FSNamesystem.java:3227)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:3191)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:614)
	at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.append(AuthorizationProviderProxyClientProtocol.java:126)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.append(ClientNamenodeProtocolServerSideTranslatorPB.java:416)
	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)
```

This is what it looks like in the FluentD log:
```
2017-10-04 17:23:54 -0700 [warn]: failed to communicate hdfs cluster, path: /fluentd/process/.log
2017-10-04 17:23:54 -0700 [warn]: temporarily failed to flush the buffer. next_retry=2017-10-04 17:24:24 -0700 error_class="WebHDFS::ServerError" error="Failed to connect to host :14000, Broken pipe - sendfile" plugin_id="object:3f8778538560"
2017-10-04 17:23:54 -0700 [warn]: suppressed same stacktrace
```

The source code snippets:

1) HTTPFS: direct FS output stream creation
/hadoop/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/client/HttpFSFileSystem.java
```
  /**
   * Append to an existing file (optional operation).
   *
   * IMPORTANT: The Progressable parameter is not used.
   *
   * @param f the existing file to be appended.
   * @param bufferSize the size of the buffer to be used.
   * @param progress for reporting progress if it is not null.
   *
   * @throws IOException
   */
  @Override
  public FSDataOutputStream append(Path f, int bufferSize,
      Progressable progress) throws IOException {
    Map<String, String> params = new HashMap<String, String>();
    params.put(OP_PARAM, Operation.APPEND.toString());
    return uploadData(Operation.APPEND.getMethod(), f, params, bufferSize,
        HttpURLConnection.HTTP_OK);
  }
```

2) WebHDFS: indirect FS output stream creation through DFS
/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java
```
import org.apache.hadoop.fs.FSDataOutputStream;
...
  /**
   * Handle create/append output streams
   */
  class FsPathOutputStreamRunner extends AbstractFsPathRunner<FSDataOutputStream> {
    private final int bufferSize;

    FsPathOutputStreamRunner(Op op, Path fspath, int bufferSize,
        Param<?,?>... parameters) {
      super(op, fspath, parameters);
      this.bufferSize = bufferSize;
    }

    @Override
    FSDataOutputStream getResponse(final HttpURLConnection conn)
        throws IOException {
      return new FSDataOutputStream(new BufferedOutputStream(
          conn.getOutputStream(), bufferSize), statistics) {
        @Override
        public void close() throws IOException {
          try {
            super.close();
          } finally {
            try {
              validateResponse(op, conn, true);
            } finally {
              conn.disconnect();
            }
          }
        }
      };
    }
  }
```
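One detail worth noting in the WebHDFS snippet above is that the server's response is only validated when the stream is closed. The same validate-on-close pattern can be shown in isolation; the sketch below is a generic, self-contained illustration (ValidatingOutputStream and Validator are made-up names, not Hadoop classes):

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class ValidateOnClose {

  /** Callback that checks the server's response, e.g. the HTTP status code. */
  interface Validator {
    void validate() throws IOException;
  }

  /** An output stream that runs a validation step after closing. */
  static class ValidatingOutputStream extends FilterOutputStream {
    private final Validator validator;

    ValidatingOutputStream(OutputStream out, Validator validator) {
      super(out);
      this.validator = validator;
    }

    @Override
    public void close() throws IOException {
      try {
        super.close();        // flush and close the underlying stream first
      } finally {
        validator.validate(); // then surface any deferred server-side error
      }
    }
  }

  public static void main(String[] args) throws IOException {
    ByteArrayOutputStream buf = new ByteArrayOutputStream();
    OutputStream out = new ValidatingOutputStream(buf,
        () -> System.out.println("validated after close"));
    out.write("data".getBytes());
    out.close(); // prints "validated after close"
  }
}
```

This deferral is why an append against a missing file can surface only at close time on the WebHDFS path, while the HttpFS client's uploadData checks for HTTP_OK as part of the upload call itself.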
[jira] [Commented] (HDFS-12654) APPEND API call is different in HTTPFS and NameNode REST
[ https://issues.apache.org/jira/browse/HDFS-12654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16327000#comment-16327000 ]

Masatake Iwasaki commented on HDFS-12654:
-----------------------------------------

Hi [~Nuke], could you explain the way to reproduce the issue? I got the same result for webhdfs and httpfs when I tested this on 3.1.0-SNAPSHOT.
{noformat}
$ curl -X POST -T README.txt -L -i "http://localhost:9870/webhdfs/v1/tmp/README.txt?op=APPEND"
HTTP/1.1 100 Continue

HTTP/1.1 404 Not Found
Date: Tue, 16 Jan 2018 10:41:24 GMT
Cache-Control: no-cache
Expires: Tue, 16 Jan 2018 10:41:24 GMT
Date: Tue, 16 Jan 2018 10:41:24 GMT
Pragma: no-cache
X-FRAME-OPTIONS: SAMEORIGIN
Content-Type: application/json
Transfer-Encoding: chunked

{"RemoteException":{"exception":"FileNotFoundException","javaClassName":"java.io.FileNotFoundException","message":"File /tmp/README.txt not found."}}
{noformat}
{noformat}
$ curl -X POST -i --header "Content-Type:application/octet-stream" --data-binary @README.txt 'http://localhost:14000/webhdfs/v1/tmp/README.txt?op=APPEND&user.name=centos&data=true'
HTTP/1.1 100 Continue

HTTP/1.1 404 Not Found
Date: Tue, 16 Jan 2018 10:42:17 GMT
Cache-Control: no-cache
Expires: Tue, 16 Jan 2018 10:42:17 GMT
Date: Tue, 16 Jan 2018 10:42:17 GMT
Pragma: no-cache
Set-Cookie: hadoop.auth="u=centos&p=centos&t=simple-dt&e=1516135337720&s=ZDSoknG+/x/a6XnGLAVUyUBs6vE="; Path=/; HttpOnly
Content-Type: application/json
Transfer-Encoding: chunked

{"RemoteException":{"message":"Failed to append to non-existent file \/tmp\/README.txt for client 127.0.0.1","exception":"FileNotFoundException","javaClassName":"java.io.FileNotFoundException"}}
{noformat}
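Both reproductions above return 404 with a RemoteException JSON body whose exception field is FileNotFoundException. A client wanting uniform behavior across WebHDFS and HttpFs can map the status code back to a Java exception before deciding whether to fall back to CREATE; the following is a minimal sketch with a hypothetical validateStatus helper, not part of any Hadoop client:

```java
import java.io.FileNotFoundException;
import java.io.IOException;

public class ResponseCheck {

  /** Map a WebHDFS/HttpFs-style HTTP status code to a client-side exception. */
  static void validateStatus(int status, String path) throws IOException {
    if (status == 404) {
      // both endpoints above report a missing append target with 404
      throw new FileNotFoundException("File " + path + " not found.");
    }
    if (status >= 400) {
      throw new IOException("Server returned HTTP " + status + " for " + path);
    }
  }

  public static void main(String[] args) {
    try {
      validateStatus(404, "/tmp/README.txt");
    } catch (IOException e) {
      // prints "FileNotFoundException: File /tmp/README.txt not found."
      System.out.println(e.getClass().getSimpleName() + ": " + e.getMessage());
    }
  }
}
```

With this mapping in place, the 404 from either server triggers the same FileNotFoundException path, which is exactly the condition the fluent-plugin-webhdfs fallback rescues.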