[jira] [Commented] (HDFS-12654) APPEND API call is different in HTTPFS and NameNode REST

2018-03-05 Thread SammiChen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16387350#comment-16387350
 ] 

SammiChen commented on HDFS-12654:
--

Hi [~Nuke] and [~iwasakims], it seems this is not an issue after further 
investigation. Can it be closed? 

> APPEND API call is different in HTTPFS and NameNode REST
> 
>
> Key: HDFS-12654
> URL: https://issues.apache.org/jira/browse/HDFS-12654
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, httpfs, namenode
>Affects Versions: 2.6.0, 2.7.0, 2.8.0, 3.0.0-beta1
>Reporter: Andras Czesznak
>Priority: Major
>
> The APPEND REST API call behaves differently in the NameNode REST and the 
> HTTPFS code. The NameNode version creates the target file that the new data 
> is being appended to if it does not exist at the time the call is issued. 
> The HTTPFS version assumes the target file exists when APPEND is called and 
> can only append the new data; it does not create the target file if it 
> doesn't exist.
> The two implementations should be standardized; preferably, the HTTPFS 
> version should be modified to execute an implicit CREATE if the target file 
> does not exist.
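One way a client can cope with the HTTPFS behavior is to fall back to CREATE when APPEND reports a missing file. Below is a minimal Python sketch of that fallback; `InMemoryClient`, `append_or_create`, and the exception name are hypothetical stand-ins for a real WebHDFS/HttpFS client, not part of any Hadoop API:

```python
# Sketch of the append-or-create fallback a client can use when the
# server does not create missing append targets. InMemoryClient is a
# hypothetical stand-in for a real WebHDFS/HttpFS client.

class FileNotFound(Exception):
    """Raised when an append targets a file that does not exist."""

class InMemoryClient:
    def __init__(self):
        self.files = {}

    def append(self, path, data):
        # Mirrors HTTPFS: appending to a missing file is an error.
        if path not in self.files:
            raise FileNotFound(path)
        self.files[path] += data

    def create(self, path, data):
        self.files[path] = data

def append_or_create(client, path, data):
    """Append, falling back to create when the target is missing."""
    try:
        client.append(path, data)
    except FileNotFound:
        client.create(path, data)

client = InMemoryClient()
append_or_create(client, "/tmp/a.log", "one\n")  # first call creates
append_or_create(client, "/tmp/a.log", "two\n")  # second call appends
print(client.files["/tmp/a.log"])
```

This is essentially the shape of the workaround fluent-plugin-webhdfs applies on the client side.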



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12654) APPEND API call is different in HTTPFS and NameNode REST

2018-01-29 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16344419#comment-16344419
 ] 

Masatake Iwasaki commented on HDFS-12654:
-

Thanks for the info.

WebHDFS does not create a new file when append is requested; it returns 404. 
From NamenodeWebHdfsMethods#chooseDataNode:
{noformat}
} else if (op == GetOpParam.Op.OPEN
    || op == GetOpParam.Op.GETFILECHECKSUM
    || op == PostOpParam.Op.APPEND) {
  //choose a datanode containing a replica
  final NamenodeProtocols np = getRPCServer(namenode);
  final HdfsFileStatus status = np.getFileInfo(path);
  if (status == null) {
    throw new FileNotFoundException("File " + path + " not found.");
  }
{noformat}

The non-existent file seems to be created by fluent-plugin-webhdfs. 
From out_webhdfs.rb:
{noformat}
  def send_data(path, data)
if @append
  begin
@client.append(path, data)
  rescue WebHDFS::FileNotFoundError
@client.create(path, data)
  end
{noformat}

The issue stated in the ticket is that WebHDFS returns 404 but HttpFs returns 
500. I could not reproduce this.
{quote}
WebHDFS::ServerError means that the client (fluentd) receives HTTP response 
code 500 from HttpFs server. WebHDFS server returns 404 for such cases.
{quote}
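The two failure modes differ on the wire: WebHDFS answers a missing append target with HTTP 404 and a `RemoteException` JSON body, while the HttpFs behavior reported in this ticket was HTTP 500, which generic clients (like the webhdfs gem) surface as a server error. A hypothetical Python sketch of how a client might tell them apart; the function and exception names here are illustrative, not any real library's API:

```python
import json

class MissingFileError(Exception):
    """Client-side mapping of a 404 + FileNotFoundException response."""

class ServerError(Exception):
    """Client-side mapping of an HTTP 5xx response."""

def raise_for_response(status, body):
    # WebHDFS/HttpFS wrap errors in a RemoteException JSON object.
    info = json.loads(body).get("RemoteException", {})
    if status == 404 and info.get("exception") == "FileNotFoundException":
        raise MissingFileError(info.get("message"))
    if status >= 500:
        raise ServerError(info.get("message"))

# Sample 404 body of the shape WebHDFS returns for a missing file.
body = ('{"RemoteException":{"exception":"FileNotFoundException",'
        '"javaClassName":"java.io.FileNotFoundException",'
        '"message":"File /tmp/README.txt not found."}}')
try:
    raise_for_response(404, body)
except MissingFileError as e:
    print("would fall back to create:", e)
```

Only the 404 path gives the client enough information to fall back to CREATE safely; a bare 500 forces it to retry or fail.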
 




[jira] [Commented] (HDFS-12654) APPEND API call is different in HTTPFS and NameNode REST

2018-01-23 Thread Andras Czesznak (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16336386#comment-16336386
 ] 

Andras Czesznak commented on HDFS-12654:


Hello [~iwasakims] San, the issue occurred with the FluentD app and is 
described here:

https://github.com/fluent/fluent-plugin-webhdfs/issues/46

 

From the HTTPFS log:
```
2017-10-03 16:20:59,204 WARN org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as: (auth:PROXY) via httpfs (auth:SIMPLE) cause:java.io.FileNotFoundException: failed to append to non-existent file /fluentd/process/.log for client 
 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFileInternal(FSNamesystem.java:2930)
 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFileInt(FSNamesystem.java:3227)
 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:3191)
 at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:614)
 at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.append(AuthorizationProviderProxyClientProtocol.java:126)
 at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.append(ClientNamenodeProtocolServerSideTranslatorPB.java:416)
 at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
 at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)
```

This is what it looks like in the FluentD log:
```
2017-10-04 17:23:54 -0700 [warn]: failed to communicate hdfs cluster, path: /fluentd/process/.log
2017-10-04 17:23:54 -0700 [warn]: temporarily failed to flush the buffer. next_retry=2017-10-04 17:24:24 -0700 error_class="WebHDFS::ServerError" error="Failed to connect to host :14000, Broken pipe - sendfile" plugin_id="object:3f8778538560"
2017-10-04 17:23:54 -0700 [warn]: suppressed same stacktrace
```

 

The source code snippets:

1) HTTPFS: direct FS output stream creation, in 
/hadoop/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/client/HttpFSFileSystem.java:
```
...
/**
 * Append to an existing file (optional operation).
 *
 * IMPORTANT: The Progressable parameter is not used.
 *
 * @param f the existing file to be appended.
 * @param bufferSize the size of the buffer to be used.
 * @param progress for reporting progress if it is not null.
 *
 * @throws IOException
 */
@Override
public FSDataOutputStream append(Path f, int bufferSize,
    Progressable progress) throws IOException {
  Map<String, String> params = new HashMap<String, String>();
  params.put(OP_PARAM, Operation.APPEND.toString());
  return uploadData(Operation.APPEND.getMethod(), f, params, bufferSize,
      HttpURLConnection.HTTP_OK);
}
...
```

2) WebHDFS: indirect FS output stream creation through DFS, in 
/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java:
```
...
import org.apache.hadoop.fs.FSDataOutputStream;
...
/**
 * Handle create/append output streams
 */
class FsPathOutputStreamRunner extends AbstractFsPathRunner<FSDataOutputStream> {
  private final int bufferSize;

  FsPathOutputStreamRunner(Op op, Path fspath, int bufferSize,
      Param<?,?>... parameters) {
    super(op, fspath, parameters);
    this.bufferSize = bufferSize;
  }

  @Override
  FSDataOutputStream getResponse(final HttpURLConnection conn)
      throws IOException {
    return new FSDataOutputStream(new BufferedOutputStream(
        conn.getOutputStream(), bufferSize), statistics) {
      @Override
      public void close() throws IOException {
        try {
          super.close();
        } finally {
          try {
            validateResponse(op, conn, true);
          } finally {
            conn.disconnect();
          }
        }
      }
    };
  }
}
...
```

 


[jira] [Commented] (HDFS-12654) APPEND API call is different in HTTPFS and NameNode REST

2018-01-16 Thread Masatake Iwasaki (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16327000#comment-16327000
 ] 

Masatake Iwasaki commented on HDFS-12654:
-

Hi [~Nuke], could you explain the way to reproduce the issue? I got the same 
result for webhdfs and httpfs when I tested this on 3.1.0-SNAPSHOT.

{noformat}
$ curl -X POST -T README.txt -L -i "http://localhost:9870/webhdfs/v1/tmp/README.txt?op=APPEND"
HTTP/1.1 100 Continue

HTTP/1.1 404 Not Found
Date: Tue, 16 Jan 2018 10:41:24 GMT
Cache-Control: no-cache
Expires: Tue, 16 Jan 2018 10:41:24 GMT
Date: Tue, 16 Jan 2018 10:41:24 GMT
Pragma: no-cache
X-FRAME-OPTIONS: SAMEORIGIN
Content-Type: application/json
Transfer-Encoding: chunked

{"RemoteException":{"exception":"FileNotFoundException","javaClassName":"java.io.FileNotFoundException","message":"File
 /tmp/README.txt not found."}}[centos@ip-172-32-1-195 hadoop-3.1.0-SNAPSHOT]$
{noformat}

{noformat}
$ curl -X POST -i --header "Content-Type:application/octet-stream" --data-binary @README.txt 'http://localhost:14000/webhdfs/v1/tmp/README.txt?op=APPEND&user.name=centos&data=true'
HTTP/1.1 100 Continue

HTTP/1.1 404 Not Found
Date: Tue, 16 Jan 2018 10:42:17 GMT
Cache-Control: no-cache
Expires: Tue, 16 Jan 2018 10:42:17 GMT
Date: Tue, 16 Jan 2018 10:42:17 GMT
Pragma: no-cache
Set-Cookie: hadoop.auth="u=centos&p=centos&t=simple-dt&e=1516135337720&s=ZDSoknG+/x/a6XnGLAVUyUBs6vE="; Path=/; HttpOnly
Content-Type: application/json
Transfer-Encoding: chunked

{"RemoteException":{"message":"Failed to append to non-existent file 
\/tmp\/README.txt for client 
127.0.0.1","exception":"FileNotFoundException","javaClassName":"java.io.FileNotFoundException"}}
{noformat}
 

 
