[jira] [Commented] (HADOOP-11851) s3n to swallow IOEs on inner stream close

2015-04-20 Thread Takenori Sato (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14503893#comment-14503893
 ] 

Takenori Sato commented on HADOOP-11851:


Isn't this a duplicate of HADOOP-11730?

> s3n to swallow IOEs on inner stream close
> -
>
> Key: HADOOP-11851
> URL: https://issues.apache.org/jira/browse/HADOOP-11851
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 2.6.0
>Reporter: Steve Loughran
>Assignee: Anu Engineer
>Priority: Minor
>
> We've seen a situation where some work was failing with (recurrent) 
> connection reset exceptions.
> Irrespective of the root cause, these were surfacing not in the read 
> operations, but when the input stream was being closed - including during a 
> seek().
> These exceptions could be caught and logged at warn level, rather than 
> triggering immediate failures. It shouldn't matter to the next GET whether 
> the last stream closed prematurely, as long as the new one works.
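
A minimal sketch of the kind of change being proposed, assuming an SLF4J-style 
logger and a hypothetical helper name (this is not the actual patch):

{code}
import java.io.IOException;
import java.io.InputStream;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/**
 * Hypothetical helper: close the wrapped S3 object stream, and downgrade an
 * IOException raised by close() to a warning instead of failing the caller.
 */
final class QuietCloser {
  private static final Logger LOG = LoggerFactory.getLogger(QuietCloser.class);

  static void closeQuietly(InputStream wrappedStream, String uri) {
    if (wrappedStream == null) {
      return;
    }
    try {
      wrappedStream.close();
    } catch (IOException e) {
      // The next GET opens a fresh connection, so a premature close of the
      // old stream need not fail the read/seek that triggered it.
      LOG.warn("Ignoring IOException while closing the inner stream of " + uri, e);
    }
  }
}
{code}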



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-11742) mkdir by file system shell fails on an empty bucket

2015-04-02 Thread Takenori Sato (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14392251#comment-14392251
 ] 

Takenori Sato commented on HADOOP-11742:


_mkdir_ and _ls_ worked as expected with the fix.

{code}
# hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -Dfs.s3a.access.key=ACCESS_KEY 
-Dfs.s3a.secret.key=SECRET_KEY -ls s3a://s3atest/
15/04/02 06:52:55 DEBUG s3a.S3AFileSystem: Getting path status for 
s3a://s3atest/ ()
15/04/02 06:52:55 DEBUG s3a.S3AFileSystem: s3a://s3atest/ is empty? true
15/04/02 06:52:55 DEBUG s3a.S3AFileSystem: List status for path: s3a://s3atest/
15/04/02 06:52:55 DEBUG s3a.S3AFileSystem: Getting path status for 
s3a://s3atest/ ()
15/04/02 06:52:55 DEBUG s3a.S3AFileSystem: s3a://s3atest/ is empty? true
15/04/02 06:52:55 DEBUG s3a.S3AFileSystem: listStatus: doing listObjects for 
directory 
# hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -Dfs.s3a.access.key=ACCESS_KEY 
-Dfs.s3a.secret.key=SECRET_KEY -mkdir s3a://s3atest/root
15/04/02 06:53:20 DEBUG s3a.S3AFileSystem: Getting path status for 
s3a://s3atest/root (root)
15/04/02 06:53:20 DEBUG s3a.S3AFileSystem: Not Found: s3a://s3atest/root
15/04/02 06:53:20 DEBUG s3a.S3AFileSystem: Getting path status for 
s3a://s3atest/ ()
15/04/02 06:53:20 DEBUG s3a.S3AFileSystem: s3a://s3atest/ is empty? true
15/04/02 06:53:20 DEBUG s3a.S3AFileSystem: Making directory: s3a://s3atest/root
15/04/02 06:53:20 DEBUG s3a.S3AFileSystem: Getting path status for 
s3a://s3atest/root (root)
15/04/02 06:53:20 DEBUG s3a.S3AFileSystem: Not Found: s3a://s3atest/root
15/04/02 06:53:20 DEBUG s3a.S3AFileSystem: Getting path status for 
s3a://s3atest/root (root)
15/04/02 06:53:20 DEBUG s3a.S3AFileSystem: Not Found: s3a://s3atest/root
15/04/02 06:53:20 DEBUG s3a.S3AFileSystem: Getting path status for 
s3a://s3atest/ ()
15/04/02 06:53:20 DEBUG s3a.S3AFileSystem: s3a://s3atest/ is empty? true
# hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -Dfs.s3a.access.key=ACCESS_KEY 
-Dfs.s3a.secret.key=SECRET_KEY -ls s3a://s3atest/
15/04/02 06:53:26 DEBUG s3a.S3AFileSystem: Getting path status for 
s3a://s3atest/ ()
15/04/02 06:53:26 DEBUG s3a.S3AFileSystem: s3a://s3atest/ is empty? false
15/04/02 06:53:26 DEBUG s3a.S3AFileSystem: List status for path: s3a://s3atest/
15/04/02 06:53:26 DEBUG s3a.S3AFileSystem: Getting path status for 
s3a://s3atest/ ()
15/04/02 06:53:26 DEBUG s3a.S3AFileSystem: s3a://s3atest/ is empty? false
15/04/02 06:53:26 DEBUG s3a.S3AFileSystem: listStatus: doing listObjects for 
directory 
15/04/02 06:53:26 DEBUG s3a.S3AFileSystem: Adding: rd: s3a://s3atest/root
Found 1 items
drwxrwxrwx   -  0 1970-01-01 00:00 s3a://s3atest/root 
{code}

The created directory did not become visible immediately, but the subsequent 
_ls_ showed that it had been created successfully.

> mkdir by file system shell fails on an empty bucket
> ---
>
> Key: HADOOP-11742
> URL: https://issues.apache.org/jira/browse/HADOOP-11742
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 2.7.0
> Environment: CentOS 7
>Reporter: Takenori Sato
>Assignee: Takenori Sato
>Priority: Minor
> Attachments: HADOOP-11742-branch-2.7.001.patch, 
> HADOOP-11742-branch-2.7.002.patch, HADOOP-11742-branch-2.7.003-1.patch, 
> HADOOP-11742-branch-2.7.003-2.patch
>
>
> I have built the latest 2.7, and tried S3AFileSystem.
> Then found that _mkdir_ fails on an empty bucket, named *s3a* here, as 
> follows:
> {code}
> # hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -mkdir s3a://s3a/foo
> 15/03/24 03:49:35 DEBUG s3a.S3AFileSystem: Getting path status for 
> s3a://s3a/foo (foo)
> 15/03/24 03:49:36 DEBUG s3a.S3AFileSystem: Not Found: s3a://s3a/foo
> 15/03/24 03:49:36 DEBUG s3a.S3AFileSystem: Getting path status for s3a://s3a/ 
> ()
> 15/03/24 03:49:36 DEBUG s3a.S3AFileSystem: Not Found: s3a://s3a/
> mkdir: `s3a://s3a/foo': No such file or directory
> {code}
> So does _ls_.
> {code}
> # hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -ls s3a://s3a/
> 15/03/24 03:47:48 DEBUG s3a.S3AFileSystem: Getting path status for s3a://s3a/ 
> ()
> 15/03/24 03:47:48 DEBUG s3a.S3AFileSystem: Not Found: s3a://s3a/
> ls: `s3a://s3a/': No such file or directory
> {code}
> This is how it works via s3n.
> {code}
> # hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -ls s3n://s3n/
> # hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -mkdir s3n://s3n/foo
> # hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -ls s3n://s3n/
> Found 1 items
> drwxrwxrwx   -  0 1970-01-01 00:00 s3n://s3n/foo
> {code}
> The snapshot is the following:
> {quote}
> \# git branch
> \* branch-2.7
>   trunk
> \# git log
> commit 929b04ce3a4fe419dece49ed68d4f6228be214c1
> Author: Harsh J 
> Date:   Sun Mar 22 10:18:32 2015 +0530
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-11742) mkdir by file system shell fails on an empty bucket

2015-04-01 Thread Takenori Sato (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14392231#comment-14392231
 ] 

Takenori Sato commented on HADOOP-11742:


The patches were verified as follows.

1. run TestS3AContractRootDir and confirm it succeeds

{code}
---
 T E S T S
---
Running org.apache.hadoop.fs.contract.s3a.TestS3AContractRootDir
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.855 sec - in 
org.apache.hadoop.fs.contract.s3a.TestS3AContractRootDir

Results :

Tests run: 5, Failures: 0, Errors: 0, Skipped: 0

[INFO] 
[INFO] BUILD SUCCESS
[INFO] 
[INFO] Total time: 10.341 s
[INFO] Finished at: 2015-04-02T05:41:48+00:00
[INFO] Final Memory: 28M/407M
[INFO] 
{code}

2. apply the test patch (003-2), and run TestS3AContractRootDir

{code}
---
 T E S T S
---
Running org.apache.hadoop.fs.contract.s3a.TestS3AContractRootDir
Tests run: 5, Failures: 0, Errors: 4, Skipped: 0, Time elapsed: 21.296 sec <<< 
FAILURE! - in org.apache.hadoop.fs.contract.s3a.TestS3AContractRootDir
testRmEmptyRootDirNonRecursive(org.apache.hadoop.fs.contract.s3a.TestS3AContractRootDir)
  Time elapsed: 4.608 sec  <<< ERROR!
java.io.FileNotFoundException: No such file or directory: /
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:996)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:77)
at 
org.apache.hadoop.fs.contract.ContractTestUtils.assertIsDirectory(ContractTestUtils.java:464)
at 
org.apache.hadoop.fs.contract.AbstractContractRootDirectoryTest.testRmEmptyRootDirNonRecursive(AbstractContractRootDirectoryTest.java:70)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at 
org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)

testRmRootRecursive(org.apache.hadoop.fs.contract.s3a.TestS3AContractRootDir)  
Time elapsed: 2.509 sec  <<< ERROR!
java.io.FileNotFoundException: No such file or directory: /
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:996)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:77)
at 
org.apache.hadoop.fs.contract.ContractTestUtils.assertIsDirectory(ContractTestUtils.java:464)
at 
org.apache.hadoop.fs.contract.AbstractContractRootDirectoryTest.testRmRootRecursive(AbstractContractRootDirectoryTest.java:104)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at 
org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)

testCreateFileOverRoot(org.apache.hadoop.fs.contract.s3a.TestS3AContractRootDir)
  Time elapsed: 3.006 sec  <<< ERROR!
com.amazonaws.services.s3.model.AmazonS3Exception: Status Code: 400, AWS 
Service: Amazon S3, AWS Request ID: 2B352694A5577C62, AWS Error Code: 
MalformedXML, AWS Err

[jira] [Updated] (HADOOP-11742) mkdir by file system shell fails on an empty bucket

2015-04-01 Thread Takenori Sato (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takenori Sato updated HADOOP-11742:
---
Attachment: HADOOP-11742-branch-2.7.003-2.patch

This is the patch to fix the unit test, _AbstractContractRootDirectoryTest_.

Changes are (a rough sketch follows below):
# setup() prepares an empty directory
# an assertion was added to make sure the root dir is empty in 
testRmEmptyRootDirNonRecursive()
# teardown() does nothing
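
A rough sketch of the shape of those changes (class, method, and field names 
here are hypothetical, not the actual patch):

{code}
import java.io.IOException;

import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/** Hypothetical outline of the three changes; names and bodies are assumptions. */
class RootDirTestSetupSketch {
  private static final Path ROOT = new Path("/");
  private final FileSystem fs;

  RootDirTestSetupSketch(FileSystem fs) {
    this.fs = fs;
  }

  /** 1. setup(): start each case from an empty root directory. */
  void setup() throws IOException {
    for (FileStatus status : fs.listStatus(ROOT)) {
      fs.delete(status.getPath(), true);
    }
  }

  /** 2. the assertion added to testRmEmptyRootDirNonRecursive(): root must be empty. */
  void assertRootIsEmpty() throws IOException {
    if (fs.listStatus(ROOT).length != 0) {
      throw new AssertionError("Root directory is not empty");
    }
  }

  /** 3. teardown(): deliberately a no-op, so the empty-bucket state is preserved. */
  void teardown() {
    // nothing to do
  }
}
{code}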

> mkdir by file system shell fails on an empty bucket
> ---
>
> Key: HADOOP-11742
> URL: https://issues.apache.org/jira/browse/HADOOP-11742
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 2.7.0
> Environment: CentOS 7
>Reporter: Takenori Sato
>Assignee: Takenori Sato
>Priority: Minor
> Attachments: HADOOP-11742-branch-2.7.001.patch, 
> HADOOP-11742-branch-2.7.002.patch, HADOOP-11742-branch-2.7.003-1.patch, 
> HADOOP-11742-branch-2.7.003-2.patch
>
>
> I have built the latest 2.7, and tried S3AFileSystem.
> Then found that _mkdir_ fails on an empty bucket, named *s3a* here, as 
> follows:
> {code}
> # hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -mkdir s3a://s3a/foo
> 15/03/24 03:49:35 DEBUG s3a.S3AFileSystem: Getting path status for 
> s3a://s3a/foo (foo)
> 15/03/24 03:49:36 DEBUG s3a.S3AFileSystem: Not Found: s3a://s3a/foo
> 15/03/24 03:49:36 DEBUG s3a.S3AFileSystem: Getting path status for s3a://s3a/ 
> ()
> 15/03/24 03:49:36 DEBUG s3a.S3AFileSystem: Not Found: s3a://s3a/
> mkdir: `s3a://s3a/foo': No such file or directory
> {code}
> So does _ls_.
> {code}
> # hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -ls s3a://s3a/
> 15/03/24 03:47:48 DEBUG s3a.S3AFileSystem: Getting path status for s3a://s3a/ 
> ()
> 15/03/24 03:47:48 DEBUG s3a.S3AFileSystem: Not Found: s3a://s3a/
> ls: `s3a://s3a/': No such file or directory
> {code}
> This is how it works via s3n.
> {code}
> # hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -ls s3n://s3n/
> # hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -mkdir s3n://s3n/foo
> # hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -ls s3n://s3n/
> Found 1 items
> drwxrwxrwx   -  0 1970-01-01 00:00 s3n://s3n/foo
> {code}
> The snapshot is the following:
> {quote}
> \# git branch
> \* branch-2.7
>   trunk
> \# git log
> commit 929b04ce3a4fe419dece49ed68d4f6228be214c1
> Author: Harsh J 
> Date:   Sun Mar 22 10:18:32 2015 +0530
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HADOOP-11742) mkdir by file system shell fails on an empty bucket

2015-04-01 Thread Takenori Sato (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takenori Sato updated HADOOP-11742:
---
Attachment: HADOOP-11742-branch-2.7.003-1.patch

This is the patch to fix _S3AFileSystem#getFileStatus_. It adds a dedicated 
branch for processing the root directory, which is entered only when 
key.isEmpty() == true.
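
A minimal sketch of that branch, with a simplified signature (the real method 
takes only the path and derives the key; this is not the actual patch):

{code}
import java.io.FileNotFoundException;

import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.Path;

/** Hypothetical illustration of the added branch; names and fall-through are assumptions. */
class RootStatusSketch {

  FileStatus getFileStatus(Path f, String key) throws FileNotFoundException {
    if (key.isEmpty()) {
      // Dedicated root handling: the bucket root always exists as a directory,
      // even when the bucket holds no objects, so report it directly instead of
      // falling through to the HEAD/listObjects probes that fail on an empty bucket.
      return new FileStatus(0, true, 1, 0, 0, f);
    }
    // Non-root paths keep the existing HEAD-object / listObjects logic (elided here).
    throw new FileNotFoundException("No such file or directory: " + f);
  }
}
{code}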

> mkdir by file system shell fails on an empty bucket
> ---
>
> Key: HADOOP-11742
> URL: https://issues.apache.org/jira/browse/HADOOP-11742
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 2.7.0
> Environment: CentOS 7
>Reporter: Takenori Sato
>Assignee: Takenori Sato
>Priority: Minor
> Attachments: HADOOP-11742-branch-2.7.001.patch, 
> HADOOP-11742-branch-2.7.002.patch, HADOOP-11742-branch-2.7.003-1.patch
>
>
> I have built the latest 2.7, and tried S3AFileSystem.
> Then found that _mkdir_ fails on an empty bucket, named *s3a* here, as 
> follows:
> {code}
> # hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -mkdir s3a://s3a/foo
> 15/03/24 03:49:35 DEBUG s3a.S3AFileSystem: Getting path status for 
> s3a://s3a/foo (foo)
> 15/03/24 03:49:36 DEBUG s3a.S3AFileSystem: Not Found: s3a://s3a/foo
> 15/03/24 03:49:36 DEBUG s3a.S3AFileSystem: Getting path status for s3a://s3a/ 
> ()
> 15/03/24 03:49:36 DEBUG s3a.S3AFileSystem: Not Found: s3a://s3a/
> mkdir: `s3a://s3a/foo': No such file or directory
> {code}
> So does _ls_.
> {code}
> # hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -ls s3a://s3a/
> 15/03/24 03:47:48 DEBUG s3a.S3AFileSystem: Getting path status for s3a://s3a/ 
> ()
> 15/03/24 03:47:48 DEBUG s3a.S3AFileSystem: Not Found: s3a://s3a/
> ls: `s3a://s3a/': No such file or directory
> {code}
> This is how it works via s3n.
> {code}
> # hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -ls s3n://s3n/
> # hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -mkdir s3n://s3n/foo
> # hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -ls s3n://s3n/
> Found 1 items
> drwxrwxrwx   -  0 1970-01-01 00:00 s3n://s3n/foo
> {code}
> The snapshot is the following:
> {quote}
> \# git branch
> \* branch-2.7
>   trunk
> \# git log
> commit 929b04ce3a4fe419dece49ed68d4f6228be214c1
> Author: Harsh J 
> Date:   Sun Mar 22 10:18:32 2015 +0530
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Reopened] (HADOOP-11742) mkdir by file system shell fails on an empty bucket

2015-04-01 Thread Takenori Sato (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takenori Sato reopened HADOOP-11742:


I confirmed that mkdir fails on an empty bucket on AWS as follows:

1. check that the bucket is empty, but even the _ls_ fails with an exception

{code}
# hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -Dfs.s3a.access.key=ACCESS_KEY 
-Dfs.s3a.secret.key=SECRET_KEY -ls s3a://s3atest/
15/04/02 01:49:09 DEBUG http.wire: >> "HEAD / HTTP/1.1[\r][\n]"
15/04/02 01:49:09 DEBUG http.wire: >> "Host: s3atest.s3.amazonaws.com[\r][\n]"
15/04/02 01:49:09 DEBUG http.wire: >> "Authorization: AWS XXX=[\r][\n]"
15/04/02 01:49:09 DEBUG http.wire: >> "Date: Thu, 02 Apr 2015 01:49:08 
GMT[\r][\n]"
15/04/02 01:49:09 DEBUG http.wire: >> "User-Agent: aws-sdk-java/1.7.4 
Linux/3.10.0-123.8.1.el7.centos.plus.x86_64 
Java_HotSpot(TM)_64-Bit_Server_VM/24.75-b04/1.7.0_75[\r][\n]"
15/04/02 01:49:09 DEBUG http.wire: >> "Content-Type: 
application/x-www-form-urlencoded; charset=utf-8[\r][\n]"
15/04/02 01:49:09 DEBUG http.wire: >> "Connection: Keep-Alive[\r][\n]"
15/04/02 01:49:09 DEBUG http.wire: >> "[\r][\n]"
15/04/02 01:49:09 DEBUG http.wire: << "HTTP/1.1 200 OK[\r][\n]"
15/04/02 01:49:09 DEBUG http.wire: << "x-amz-id-2: XXX[\r][\n]"
15/04/02 01:49:09 DEBUG http.wire: << "x-amz-request-id: XXX[\r][\n]"
15/04/02 01:49:09 DEBUG http.wire: << "Date: Thu, 02 Apr 2015 01:49:10 
GMT[\r][\n]"
15/04/02 01:49:09 DEBUG http.wire: << "Content-Type: application/xml[\r][\n]"
15/04/02 01:49:09 DEBUG http.wire: << "Transfer-Encoding: chunked[\r][\n]"
15/04/02 01:49:09 DEBUG http.wire: << "Server: AmazonS3[\r][\n]"
15/04/02 01:49:09 DEBUG http.wire: << "[\r][\n]"
15/04/02 01:49:09 DEBUG s3a.S3AFileSystem: Getting path status for 
s3a://s3atest/ ()
15/04/02 01:49:09 DEBUG http.wire: >> "GET /?delimiter=%2F&max-keys=1&prefix= 
HTTP/1.1[\r][\n]"
15/04/02 01:49:09 DEBUG http.wire: >> "Host: s3atest.s3.amazonaws.com[\r][\n]"
15/04/02 01:49:09 DEBUG http.wire: >> "Authorization: AWS XXX=[\r][\n]"
15/04/02 01:49:09 DEBUG http.wire: >> "Date: Thu, 02 Apr 2015 01:49:09 
GMT[\r][\n]"
15/04/02 01:49:09 DEBUG http.wire: >> "User-Agent: aws-sdk-java/1.7.4 
Linux/3.10.0-123.8.1.el7.centos.plus.x86_64 
Java_HotSpot(TM)_64-Bit_Server_VM/24.75-b04/1.7.0_75[\r][\n]"
15/04/02 01:49:09 DEBUG http.wire: >> "Content-Type: 
application/x-www-form-urlencoded; charset=utf-8[\r][\n]"
15/04/02 01:49:09 DEBUG http.wire: >> "Connection: Keep-Alive[\r][\n]"
15/04/02 01:49:09 DEBUG http.wire: >> "[\r][\n]"
15/04/02 01:49:09 DEBUG http.wire: << "HTTP/1.1 200 OK[\r][\n]"
15/04/02 01:49:09 DEBUG http.wire: << "x-amz-id-2: XXX[\r][\n]"
15/04/02 01:49:09 DEBUG http.wire: << "x-amz-request-id: XXX[\r][\n]"
15/04/02 01:49:09 DEBUG http.wire: << "Date: Thu, 02 Apr 2015 01:49:10 
GMT[\r][\n]"
15/04/02 01:49:09 DEBUG http.wire: << "Content-Type: application/xml[\r][\n]"
15/04/02 01:49:09 DEBUG http.wire: << "Transfer-Encoding: chunked[\r][\n]"
15/04/02 01:49:09 DEBUG http.wire: << "Server: AmazonS3[\r][\n]"
15/04/02 01:49:09 DEBUG http.wire: << "[\r][\n]"
15/04/02 01:49:09 DEBUG http.wire: << "fe[\r][\n]"
15/04/02 01:49:09 DEBUG http.wire: << "[\n]"
15/04/02 01:49:09 DEBUG http.wire: << "http://s3.amazonaws.com/doc/2006-03-01/";>s3atest1/false"
15/04/02 01:49:09 DEBUG http.wire: << "[\r][\n]"
15/04/02 01:49:09 DEBUG http.wire: << "0[\r][\n]"
15/04/02 01:49:09 DEBUG http.wire: << "[\r][\n]"
15/04/02 01:49:09 DEBUG s3a.S3AFileSystem: Not Found: s3a://s3atest/
ls: `s3a://s3atest/': No such file or directory
{code}

2. create a directory, but again get an exception

{code}
# hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -Dfs.s3a.access.key=ACCESS_KEY 
-Dfs.s3a.secret.key=SECRET_KEY -mkdir s3a://s3atest/root
15/04/02 01:49:41 DEBUG http.wire: >> "HEAD / HTTP/1.1[\r][\n]"
15/04/02 01:49:41 DEBUG http.wire: >> "Host: s3atest.s3.amazonaws.com[\r][\n]"
15/04/02 01:49:41 DEBUG http.wire: >> "Authorization: AWS XXX=[\r][\n]"
15/04/02 01:49:41 DEBUG http.wire: >> "Date: Thu, 02 Apr 2015 01:49:41 
GMT[\r][\n]"
15/04/02 01:49:41 DEBUG http.wire: >> "User-Agent: aws-sdk-java/1.7.4 
Linux/3.10.0-123.8.1.el7.centos.plus.x86_64 
Java_HotSpot(TM)_64-Bit_Server_VM/24.75-b04/1.7.0_75[\r][\n]"
15/04/02 01:49:41 DEBUG http.wire: >> "Content-Type: 
application/x-www-form-urlencoded; charset=utf-8[\r][\n]"
15/04/02 01:49:41 DEBUG http.wire: >> "Connection: Keep-Alive[\r][\n]"
15/04/02 01:49:41 DEBUG http.wire: >> "[\r][\n]"
15/04/02 01:49:41 DEBUG http.wire: << "HTTP/1.1 200 OK[\r][\n]"
15/04/02 01:49:41 DEBUG http.wire: << "x-amz-id-2: XXX[\r][\n]"
15/04/02 01:49:41 DEBUG http.wire: << "x-amz-request-id: XXX[\r][\n]"
15/04/02 01:49:41 DEBUG http.wire: << "Date: Thu, 02 Apr 2015 01:49:42 
GMT[\r][\n]"
15/04/02 01:49:41 DEBUG http.wire: << "Content-Type: application/xml[\r][\n]"
15/04/02 01:49:41 DEBUG http.wire: << "Transfer-Encoding: chunked[\r][\n]"
15/04/02 01:49:41 DEBUG http.wire: << "Server: AmazonS3[\r][\n]"
15/04/02 01:49:41 DEBUG http.wire: << "[\r][\n]"
15/04

[jira] [Commented] (HADOOP-11753) TestS3AContractOpen#testOpenReadZeroByteFile fails due to negative range header

2015-03-30 Thread Takenori Sato (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386493#comment-14386493
 ] 

Takenori Sato commented on HADOOP-11753:


Thanks, it makes sense. I will discuss internally.

> TestS3AContractOpen#testOpenReadZeroByteFile fails due to negative range 
> header
> ---
>
> Key: HADOOP-11753
> URL: https://issues.apache.org/jira/browse/HADOOP-11753
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 3.0.0, 2.7.0
>Reporter: Takenori Sato
>Assignee: Takenori Sato
> Attachments: HADOOP-11753-branch-2.7.001.patch
>
>
> _TestS3AContractOpen#testOpenReadZeroByteFile_ fails as follows.
> {code}
> testOpenReadZeroByteFile(org.apache.hadoop.fs.contract.s3a.TestS3AContractOpen)
>   Time elapsed: 3.312 sec  <<< ERROR!
> com.amazonaws.services.s3.model.AmazonS3Exception: Status Code: 416, AWS 
> Service: Amazon S3, AWS Request ID: A58A95E0D36811E4, AWS Error Code: 
> InvalidRange, AWS Error Message: The requested range cannot be satisfied.
>   at 
> com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:798)
>   at 
> com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:421)
>   at 
> com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:232)
>   at 
> com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3528)
>   at 
> com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:)
>   at 
> org.apache.hadoop.fs.s3a.S3AInputStream.reopen(S3AInputStream.java:91)
>   at 
> org.apache.hadoop.fs.s3a.S3AInputStream.openIfNeeded(S3AInputStream.java:62)
>   at org.apache.hadoop.fs.s3a.S3AInputStream.read(S3AInputStream.java:127)
>   at java.io.FilterInputStream.read(FilterInputStream.java:83)
>   at 
> org.apache.hadoop.fs.contract.AbstractContractOpenTest.testOpenReadZeroByteFile(AbstractContractOpenTest.java:66)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {code}
> This is because the header is wrong when calling _S3AInputStream#read_ after 
> _S3AInputStream#open_.
> {code}
> Range: bytes=0--1
> * from 0 to -1
> {code}
> Tested on the latest branch-2.7.
> {quote}
> $ git log
> commit d286673c602524af08935ea132c8afd181b6e2e4
> Author: Jitendra Pandey 
> Date:   Tue Mar 24 16:17:06 2015 -0700
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Reopened] (HADOOP-11742) mkdir by file system shell fails on an empty bucket

2015-03-29 Thread Takenori Sato (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takenori Sato reopened HADOOP-11742:


Reopening to mark this as invalid.

> mkdir by file system shell fails on an empty bucket
> ---
>
> Key: HADOOP-11742
> URL: https://issues.apache.org/jira/browse/HADOOP-11742
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 2.7.0
> Environment: CentOS 7
>Reporter: Takenori Sato
>Assignee: Takenori Sato
>Priority: Minor
> Attachments: HADOOP-11742-branch-2.7.001.patch, 
> HADOOP-11742-branch-2.7.002.patch
>
>
> I have built the latest 2.7, and tried S3AFileSystem.
> Then found that _mkdir_ fails on an empty bucket, named *s3a* here, as 
> follows:
> {code}
> # hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -mkdir s3a://s3a/foo
> 15/03/24 03:49:35 DEBUG s3a.S3AFileSystem: Getting path status for 
> s3a://s3a/foo (foo)
> 15/03/24 03:49:36 DEBUG s3a.S3AFileSystem: Not Found: s3a://s3a/foo
> 15/03/24 03:49:36 DEBUG s3a.S3AFileSystem: Getting path status for s3a://s3a/ 
> ()
> 15/03/24 03:49:36 DEBUG s3a.S3AFileSystem: Not Found: s3a://s3a/
> mkdir: `s3a://s3a/foo': No such file or directory
> {code}
> So does _ls_.
> {code}
> # hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -ls s3a://s3a/
> 15/03/24 03:47:48 DEBUG s3a.S3AFileSystem: Getting path status for s3a://s3a/ 
> ()
> 15/03/24 03:47:48 DEBUG s3a.S3AFileSystem: Not Found: s3a://s3a/
> ls: `s3a://s3a/': No such file or directory
> {code}
> This is how it works via s3n.
> {code}
> # hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -ls s3n://s3n/
> # hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -mkdir s3n://s3n/foo
> # hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -ls s3n://s3n/
> Found 1 items
> drwxrwxrwx   -  0 1970-01-01 00:00 s3n://s3n/foo
> {code}
> The snapshot is the following:
> {quote}
> \# git branch
> \* branch-2.7
>   trunk
> \# git log
> commit 929b04ce3a4fe419dece49ed68d4f6228be214c1
> Author: Harsh J 
> Date:   Sun Mar 22 10:18:32 2015 +0530
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HADOOP-11742) mkdir by file system shell fails on an empty bucket

2015-03-29 Thread Takenori Sato (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takenori Sato resolved HADOOP-11742.

Resolution: Invalid

> mkdir by file system shell fails on an empty bucket
> ---
>
> Key: HADOOP-11742
> URL: https://issues.apache.org/jira/browse/HADOOP-11742
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 2.7.0
> Environment: CentOS 7
>Reporter: Takenori Sato
>Assignee: Takenori Sato
>Priority: Minor
> Attachments: HADOOP-11742-branch-2.7.001.patch, 
> HADOOP-11742-branch-2.7.002.patch
>
>
> I have built the latest 2.7, and tried S3AFileSystem.
> Then found that _mkdir_ fails on an empty bucket, named *s3a* here, as 
> follows:
> {code}
> # hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -mkdir s3a://s3a/foo
> 15/03/24 03:49:35 DEBUG s3a.S3AFileSystem: Getting path status for 
> s3a://s3a/foo (foo)
> 15/03/24 03:49:36 DEBUG s3a.S3AFileSystem: Not Found: s3a://s3a/foo
> 15/03/24 03:49:36 DEBUG s3a.S3AFileSystem: Getting path status for s3a://s3a/ 
> ()
> 15/03/24 03:49:36 DEBUG s3a.S3AFileSystem: Not Found: s3a://s3a/
> mkdir: `s3a://s3a/foo': No such file or directory
> {code}
> So does _ls_.
> {code}
> # hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -ls s3a://s3a/
> 15/03/24 03:47:48 DEBUG s3a.S3AFileSystem: Getting path status for s3a://s3a/ 
> ()
> 15/03/24 03:47:48 DEBUG s3a.S3AFileSystem: Not Found: s3a://s3a/
> ls: `s3a://s3a/': No such file or directory
> {code}
> This is how it works via s3n.
> {code}
> # hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -ls s3n://s3n/
> # hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -mkdir s3n://s3n/foo
> # hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -ls s3n://s3n/
> Found 1 items
> drwxrwxrwx   -  0 1970-01-01 00:00 s3n://s3n/foo
> {code}
> The snapshot is the following:
> {quote}
> \# git branch
> \* branch-2.7
>   trunk
> \# git log
> commit 929b04ce3a4fe419dece49ed68d4f6228be214c1
> Author: Harsh J 
> Date:   Sun Mar 22 10:18:32 2015 +0530
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-11742) mkdir by file system shell fails on an empty bucket

2015-03-29 Thread Takenori Sato (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386172#comment-14386172
 ] 

Takenori Sato commented on HADOOP-11742:


Thomas, Steve, yes, this is again against our own S3-compatible store. I will 
check the difference. Let me close this.

> mkdir by file system shell fails on an empty bucket
> ---
>
> Key: HADOOP-11742
> URL: https://issues.apache.org/jira/browse/HADOOP-11742
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 2.7.0
> Environment: CentOS 7
>Reporter: Takenori Sato
>Priority: Minor
> Attachments: HADOOP-11742-branch-2.7.001.patch, 
> HADOOP-11742-branch-2.7.002.patch
>
>
> I have built the latest 2.7, and tried S3AFileSystem.
> Then found that _mkdir_ fails on an empty bucket, named *s3a* here, as 
> follows:
> {code}
> # hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -mkdir s3a://s3a/foo
> 15/03/24 03:49:35 DEBUG s3a.S3AFileSystem: Getting path status for 
> s3a://s3a/foo (foo)
> 15/03/24 03:49:36 DEBUG s3a.S3AFileSystem: Not Found: s3a://s3a/foo
> 15/03/24 03:49:36 DEBUG s3a.S3AFileSystem: Getting path status for s3a://s3a/ 
> ()
> 15/03/24 03:49:36 DEBUG s3a.S3AFileSystem: Not Found: s3a://s3a/
> mkdir: `s3a://s3a/foo': No such file or directory
> {code}
> So does _ls_.
> {code}
> # hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -ls s3a://s3a/
> 15/03/24 03:47:48 DEBUG s3a.S3AFileSystem: Getting path status for s3a://s3a/ 
> ()
> 15/03/24 03:47:48 DEBUG s3a.S3AFileSystem: Not Found: s3a://s3a/
> ls: `s3a://s3a/': No such file or directory
> {code}
> This is how it works via s3n.
> {code}
> # hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -ls s3n://s3n/
> # hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -mkdir s3n://s3n/foo
> # hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -ls s3n://s3n/
> Found 1 items
> drwxrwxrwx   -  0 1970-01-01 00:00 s3n://s3n/foo
> {code}
> The snapshot is the following:
> {quote}
> \# git branch
> \* branch-2.7
>   trunk
> \# git log
> commit 929b04ce3a4fe419dece49ed68d4f6228be214c1
> Author: Harsh J 
> Date:   Sun Mar 22 10:18:32 2015 +0530
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HADOOP-11742) mkdir by file system shell fails on an empty bucket

2015-03-29 Thread Takenori Sato (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takenori Sato updated HADOOP-11742:
---
Resolution: Fixed
  Assignee: Takenori Sato
Status: Resolved  (was: Patch Available)

> mkdir by file system shell fails on an empty bucket
> ---
>
> Key: HADOOP-11742
> URL: https://issues.apache.org/jira/browse/HADOOP-11742
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 2.7.0
> Environment: CentOS 7
>Reporter: Takenori Sato
>Assignee: Takenori Sato
>Priority: Minor
> Attachments: HADOOP-11742-branch-2.7.001.patch, 
> HADOOP-11742-branch-2.7.002.patch
>
>
> I have built the latest 2.7, and tried S3AFileSystem.
> Then found that _mkdir_ fails on an empty bucket, named *s3a* here, as 
> follows:
> {code}
> # hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -mkdir s3a://s3a/foo
> 15/03/24 03:49:35 DEBUG s3a.S3AFileSystem: Getting path status for 
> s3a://s3a/foo (foo)
> 15/03/24 03:49:36 DEBUG s3a.S3AFileSystem: Not Found: s3a://s3a/foo
> 15/03/24 03:49:36 DEBUG s3a.S3AFileSystem: Getting path status for s3a://s3a/ 
> ()
> 15/03/24 03:49:36 DEBUG s3a.S3AFileSystem: Not Found: s3a://s3a/
> mkdir: `s3a://s3a/foo': No such file or directory
> {code}
> So does _ls_.
> {code}
> # hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -ls s3a://s3a/
> 15/03/24 03:47:48 DEBUG s3a.S3AFileSystem: Getting path status for s3a://s3a/ 
> ()
> 15/03/24 03:47:48 DEBUG s3a.S3AFileSystem: Not Found: s3a://s3a/
> ls: `s3a://s3a/': No such file or directory
> {code}
> This is how it works via s3n.
> {code}
> # hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -ls s3n://s3n/
> # hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -mkdir s3n://s3n/foo
> # hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -ls s3n://s3n/
> Found 1 items
> drwxrwxrwx   -  0 1970-01-01 00:00 s3n://s3n/foo
> {code}
> The snapshot is the following:
> {quote}
> \# git branch
> \* branch-2.7
>   trunk
> \# git log
> commit 929b04ce3a4fe419dece49ed68d4f6228be214c1
> Author: Harsh J 
> Date:   Sun Mar 22 10:18:32 2015 +0530
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HADOOP-11753) TestS3AContractOpen#testOpenReadZeroByteFile fails due to negative range header

2015-03-29 Thread Takenori Sato (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takenori Sato resolved HADOOP-11753.

Resolution: Invalid

> TestS3AContractOpen#testOpenReadZeroByteFile fails due to negative range 
> header
> ---
>
> Key: HADOOP-11753
> URL: https://issues.apache.org/jira/browse/HADOOP-11753
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 3.0.0, 2.7.0
>Reporter: Takenori Sato
>Assignee: Takenori Sato
> Attachments: HADOOP-11753-branch-2.7.001.patch
>
>
> _TestS3AContractOpen#testOpenReadZeroByteFile_ fails as follows.
> {code}
> testOpenReadZeroByteFile(org.apache.hadoop.fs.contract.s3a.TestS3AContractOpen)
>   Time elapsed: 3.312 sec  <<< ERROR!
> com.amazonaws.services.s3.model.AmazonS3Exception: Status Code: 416, AWS 
> Service: Amazon S3, AWS Request ID: A58A95E0D36811E4, AWS Error Code: 
> InvalidRange, AWS Error Message: The requested range cannot be satisfied.
>   at 
> com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:798)
>   at 
> com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:421)
>   at 
> com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:232)
>   at 
> com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3528)
>   at 
> com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:)
>   at 
> org.apache.hadoop.fs.s3a.S3AInputStream.reopen(S3AInputStream.java:91)
>   at 
> org.apache.hadoop.fs.s3a.S3AInputStream.openIfNeeded(S3AInputStream.java:62)
>   at org.apache.hadoop.fs.s3a.S3AInputStream.read(S3AInputStream.java:127)
>   at java.io.FilterInputStream.read(FilterInputStream.java:83)
>   at 
> org.apache.hadoop.fs.contract.AbstractContractOpenTest.testOpenReadZeroByteFile(AbstractContractOpenTest.java:66)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {code}
> This is because the header is wrong when calling _S3AInputStream#read_ after 
> _S3AInputStream#open_.
> {code}
> Range: bytes=0--1
> * from 0 to -1
> {code}
> Tested on the latest branch-2.7.
> {quote}
> $ git log
> commit d286673c602524af08935ea132c8afd181b6e2e4
> Author: Jitendra Pandey 
> Date:   Tue Mar 24 16:17:06 2015 -0700
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-11753) TestS3AContractOpen#testOpenReadZeroByteFile fails due to negative range header

2015-03-29 Thread Takenori Sato (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14386168#comment-14386168
 ] 

Takenori Sato commented on HADOOP-11753:


Thanks for the clarification. Yes, this is against Cloudian, so let me close 
it. I will check against AWS as well in further tests.

> TestS3AContractOpen#testOpenReadZeroByteFile fails due to negative range 
> header
> ---
>
> Key: HADOOP-11753
> URL: https://issues.apache.org/jira/browse/HADOOP-11753
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 3.0.0, 2.7.0
>Reporter: Takenori Sato
>Assignee: Takenori Sato
> Attachments: HADOOP-11753-branch-2.7.001.patch
>
>
> _TestS3AContractOpen#testOpenReadZeroByteFile_ fails as follows.
> {code}
> testOpenReadZeroByteFile(org.apache.hadoop.fs.contract.s3a.TestS3AContractOpen)
>   Time elapsed: 3.312 sec  <<< ERROR!
> com.amazonaws.services.s3.model.AmazonS3Exception: Status Code: 416, AWS 
> Service: Amazon S3, AWS Request ID: A58A95E0D36811E4, AWS Error Code: 
> InvalidRange, AWS Error Message: The requested range cannot be satisfied.
>   at 
> com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:798)
>   at 
> com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:421)
>   at 
> com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:232)
>   at 
> com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3528)
>   at 
> com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:)
>   at 
> org.apache.hadoop.fs.s3a.S3AInputStream.reopen(S3AInputStream.java:91)
>   at 
> org.apache.hadoop.fs.s3a.S3AInputStream.openIfNeeded(S3AInputStream.java:62)
>   at org.apache.hadoop.fs.s3a.S3AInputStream.read(S3AInputStream.java:127)
>   at java.io.FilterInputStream.read(FilterInputStream.java:83)
>   at 
> org.apache.hadoop.fs.contract.AbstractContractOpenTest.testOpenReadZeroByteFile(AbstractContractOpenTest.java:66)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {code}
> This is because the header is wrong when calling _S3AInputStream#read_ after 
> _S3AInputStream#open_.
> {code}
> Range: bytes=0--1
> * from 0 to -1
> {code}
> Tested on the latest branch-2.7.
> {quote}
> $ git log
> commit d286673c602524af08935ea132c8afd181b6e2e4
> Author: Jitendra Pandey 
> Date:   Tue Mar 24 16:17:06 2015 -0700
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HADOOP-11742) mkdir by file system shell fails on an empty bucket

2015-03-26 Thread Takenori Sato (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takenori Sato updated HADOOP-11742:
---
Attachment: HADOOP-11742-branch-2.7.002.patch

I found that _AbstractFSContractTestBase#setup_ always creates a test directory, 
which is removed at _teardown_. As a result, the concrete test cases never 
exercised an empty bucket.

The problem here is not the mkdir call itself: calling 
_S3AFileSystem#getFileStatus("/")_ on an empty bucket throws an exception.

To set up that condition, I chose instead to remove the test directory in 
setup() and make teardown() a no-op.

Then, without this fix, TestS3AContractRootDir failed as follows.

{code}
---
 T E S T S
---
Running org.apache.hadoop.fs.contract.s3a.TestS3AContractRootDir
Tests run: 5, Failures: 0, Errors: 4, Skipped: 0, Time elapsed: 8.027 sec <<< 
FAILURE! - in org.apache.hadoop.fs.contract.s3a.TestS3AContractRootDir
testRmEmptyRootDirNonRecursive(org.apache.hadoop.fs.contract.s3a.TestS3AContractRootDir)
  Time elapsed: 2.82 sec  <<< ERROR!
java.io.FileNotFoundException: No such file or directory: /
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:995)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:77)
at 
org.apache.hadoop.fs.contract.ContractTestUtils.assertIsDirectory(ContractTestUtils.java:464)
at 
org.apache.hadoop.fs.contract.AbstractContractRootDirectoryTest.testRmEmptyRootDirNonRecursive(AbstractContractRootDirectoryTest.java:63)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at 
org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)

testRmRootRecursive(org.apache.hadoop.fs.contract.s3a.TestS3AContractRootDir)  
Time elapsed: 0.475 sec  <<< ERROR!
java.io.FileNotFoundException: No such file or directory: /
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:995)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:77)
at 
org.apache.hadoop.fs.contract.ContractTestUtils.assertIsDirectory(ContractTestUtils.java:464)
at 
org.apache.hadoop.fs.contract.AbstractContractRootDirectoryTest.testRmRootRecursive(AbstractContractRootDirectoryTest.java:96)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at 
org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)

testCreateFileOverRoot(org.apache.hadoop.fs.contract.s3a.TestS3AContractRootDir)
  Time elapsed: 2.922 sec  <<< ERROR!
com.amazonaws.services.s3.model.AmazonS3Exception: Status Code: 400, AWS 
Service: Amazon S3, AWS Request ID: 368CF290D38711E4, AWS Error Code: 
MalformedXML, AWS Error Message: The XML you provided was not well-formed or 
did not validate against our published schema.
at 
com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:798)
at 
com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:421)
at 
com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:232)
at 
com.amazonaws.services.s3.AmazonS3Cl

[jira] [Updated] (HADOOP-11753) TestS3AContractOpen#testOpenReadZeroByteFile fails due to negative range header

2015-03-25 Thread Takenori Sato (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takenori Sato updated HADOOP-11753:
---
Attachment: HADOOP-11753-branch-2.7.001.patch

Set the Range header only when contentLength > 0.
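
A minimal sketch of the guarded request, with assumed parameter names (not the 
actual patch):

{code}
import com.amazonaws.services.s3.model.GetObjectRequest;

/** Hypothetical illustration of the guarded Range request; names are assumptions. */
class RangeGuardSketch {

  GetObjectRequest buildRequest(String bucket, String key, long pos, long contentLength) {
    GetObjectRequest request = new GetObjectRequest(bucket, key);
    if (contentLength > 0) {
      // Only request a byte range when there is at least one byte to read;
      // for a zero-byte object, "Range: bytes=0--1" is rejected with 416 InvalidRange.
      request.setRange(pos, contentLength - 1);
    }
    return request;
  }
}
{code}

With the patch applied, the s3a and s3n test suites pass: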

{code}
---
 T E S T S
---
Running org.apache.hadoop.fs.contract.s3a.TestS3AContractCreate
Tests run: 6, Failures: 0, Errors: 0, Skipped: 3, Time elapsed: 19.821 sec - in 
org.apache.hadoop.fs.contract.s3a.TestS3AContractCreate
Running org.apache.hadoop.fs.contract.s3a.TestS3AContractDelete
Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 15.186 sec - in 
org.apache.hadoop.fs.contract.s3a.TestS3AContractDelete
Running org.apache.hadoop.fs.contract.s3a.TestS3AContractMkdir
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 10.563 sec - in 
org.apache.hadoop.fs.contract.s3a.TestS3AContractMkdir
Running org.apache.hadoop.fs.contract.s3a.TestS3AContractOpen
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 10.412 sec - in 
org.apache.hadoop.fs.contract.s3a.TestS3AContractOpen
Running org.apache.hadoop.fs.contract.s3a.TestS3AContractRename
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 25.687 sec - in 
org.apache.hadoop.fs.contract.s3a.TestS3AContractRename
Running org.apache.hadoop.fs.contract.s3a.TestS3AContractRootDir
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.29 sec - in 
org.apache.hadoop.fs.contract.s3a.TestS3AContractRootDir
Running org.apache.hadoop.fs.contract.s3a.TestS3AContractSeek
Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 36.943 sec - 
in org.apache.hadoop.fs.contract.s3a.TestS3AContractSeek
Running org.apache.hadoop.fs.contract.s3n.TestS3NContractCreate
Tests run: 6, Failures: 0, Errors: 0, Skipped: 3, Time elapsed: 16.791 sec - in 
org.apache.hadoop.fs.contract.s3n.TestS3NContractCreate
Running org.apache.hadoop.fs.contract.s3n.TestS3NContractDelete
Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 12.891 sec - in 
org.apache.hadoop.fs.contract.s3n.TestS3NContractDelete
Running org.apache.hadoop.fs.contract.s3n.TestS3NContractMkdir
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 9.791 sec - in 
org.apache.hadoop.fs.contract.s3n.TestS3NContractMkdir
Running org.apache.hadoop.fs.contract.s3n.TestS3NContractOpen
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 8.736 sec - in 
org.apache.hadoop.fs.contract.s3n.TestS3NContractOpen
Running org.apache.hadoop.fs.contract.s3n.TestS3NContractRename
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 22.308 sec - in 
org.apache.hadoop.fs.contract.s3n.TestS3NContractRename
Running org.apache.hadoop.fs.contract.s3n.TestS3NContractRootDir
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.716 sec - in 
org.apache.hadoop.fs.contract.s3n.TestS3NContractRootDir
Running org.apache.hadoop.fs.contract.s3n.TestS3NContractSeek
Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 33.433 sec - 
in org.apache.hadoop.fs.contract.s3n.TestS3NContractSeek
Running org.apache.hadoop.fs.s3.TestInMemoryS3FileSystemContract
Tests run: 31, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.641 sec - in 
org.apache.hadoop.fs.s3.TestInMemoryS3FileSystemContract
Running org.apache.hadoop.fs.s3.TestINode
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.127 sec - in 
org.apache.hadoop.fs.s3.TestINode
Running org.apache.hadoop.fs.s3.TestS3Credentials
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.36 sec - in 
org.apache.hadoop.fs.s3.TestS3Credentials
Running org.apache.hadoop.fs.s3.TestS3FileSystem
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.362 sec - in 
org.apache.hadoop.fs.s3.TestS3FileSystem
Running org.apache.hadoop.fs.s3.TestS3InMemoryFileSystem
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.754 sec - in 
org.apache.hadoop.fs.s3.TestS3InMemoryFileSystem
Running org.apache.hadoop.fs.s3a.scale.TestS3ADeleteManyFiles
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 500.943 sec - 
in org.apache.hadoop.fs.s3a.scale.TestS3ADeleteManyFiles
Running org.apache.hadoop.fs.s3a.TestS3ABlocksize
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 6.25 sec - in 
org.apache.hadoop.fs.s3a.TestS3ABlocksize
Running org.apache.hadoop.fs.s3a.TestS3AConfiguration
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.334 sec - in 
org.apache.hadoop.fs.s3a.TestS3AConfiguration
Running org.apache.hadoop.fs.s3a.TestS3AFastOutputStream
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 9.867 sec - in 
org.apache.hadoop.fs.s3a.TestS3AFastOutputStream
Running org.apache.hadoop.fs.s3a.TestS3AFileSystemContract
Tests run: 31, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 79.965 sec - 
in org.apache.ha

[jira] [Commented] (HADOOP-11753) TestS3AContractOpen#testOpenReadZeroByteFile fails due to negative range header

2015-03-25 Thread Takenori Sato (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381384#comment-14381384
 ] 

Takenori Sato commented on HADOOP-11753:


Hi, OK, thanks. I was about to start on it, so I will leave this to you.

> TestS3AContractOpen#testOpenReadZeroByteFile fails due to negative range 
> header
> ---
>
> Key: HADOOP-11753
> URL: https://issues.apache.org/jira/browse/HADOOP-11753
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Reporter: Takenori Sato
>Assignee: J.Andreina
>
> _TestS3AContractOpen#testOpenReadZeroByteFile_ fails as follows.
> {code}
> testOpenReadZeroByteFile(org.apache.hadoop.fs.contract.s3a.TestS3AContractOpen)
>   Time elapsed: 3.312 sec  <<< ERROR!
> com.amazonaws.services.s3.model.AmazonS3Exception: Status Code: 416, AWS 
> Service: Amazon S3, AWS Request ID: A58A95E0D36811E4, AWS Error Code: 
> InvalidRange, AWS Error Message: The requested range cannot be satisfied.
>   at 
> com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:798)
>   at 
> com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:421)
>   at 
> com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:232)
>   at 
> com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3528)
>   at 
> com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:)
>   at 
> org.apache.hadoop.fs.s3a.S3AInputStream.reopen(S3AInputStream.java:91)
>   at 
> org.apache.hadoop.fs.s3a.S3AInputStream.openIfNeeded(S3AInputStream.java:62)
>   at org.apache.hadoop.fs.s3a.S3AInputStream.read(S3AInputStream.java:127)
>   at java.io.FilterInputStream.read(FilterInputStream.java:83)
>   at 
> org.apache.hadoop.fs.contract.AbstractContractOpenTest.testOpenReadZeroByteFile(AbstractContractOpenTest.java:66)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {code}
> This is because the header is wrong when calling _S3AInputStream#read_ after 
> _S3AInputStream#open_.
> {code}
> Range: bytes=0--1
> * from 0 to -1
> {code}
> Tested on the latest branch-2.7.
> {quote}
> $ git log
> commit d286673c602524af08935ea132c8afd181b6e2e4
> Author: Jitendra Pandey 
> Date:   Tue Mar 24 16:17:06 2015 -0700
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-11742) mkdir by file system shell fails on an empty bucket

2015-03-25 Thread Takenori Sato (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381351#comment-14381351
 ] 

Takenori Sato commented on HADOOP-11742:


OK, will do. But I found that the current s3a-related unit tests do not finish 
successfully; filed as HADOOP-11753.

> mkdir by file system shell fails on an empty bucket
> ---
>
> Key: HADOOP-11742
> URL: https://issues.apache.org/jira/browse/HADOOP-11742
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
> Environment: CentOS 7
>Reporter: Takenori Sato
> Attachments: HADOOP-11742-branch-2.7.001.patch
>
>
> I have built the latest 2.7, and tried S3AFileSystem.
> Then found that _mkdir_ fails on an empty bucket, named *s3a* here, as 
> follows:
> {code}
> # hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -mkdir s3a://s3a/foo
> 15/03/24 03:49:35 DEBUG s3a.S3AFileSystem: Getting path status for 
> s3a://s3a/foo (foo)
> 15/03/24 03:49:36 DEBUG s3a.S3AFileSystem: Not Found: s3a://s3a/foo
> 15/03/24 03:49:36 DEBUG s3a.S3AFileSystem: Getting path status for s3a://s3a/ 
> ()
> 15/03/24 03:49:36 DEBUG s3a.S3AFileSystem: Not Found: s3a://s3a/
> mkdir: `s3a://s3a/foo': No such file or directory
> {code}
> So does _ls_.
> {code}
> # hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -ls s3a://s3a/
> 15/03/24 03:47:48 DEBUG s3a.S3AFileSystem: Getting path status for s3a://s3a/ 
> ()
> 15/03/24 03:47:48 DEBUG s3a.S3AFileSystem: Not Found: s3a://s3a/
> ls: `s3a://s3a/': No such file or directory
> {code}
> This is how it works via s3n.
> {code}
> # hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -ls s3n://s3n/
> # hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -mkdir s3n://s3n/foo
> # hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -ls s3n://s3n/
> Found 1 items
> drwxrwxrwx   -  0 1970-01-01 00:00 s3n://s3n/foo
> {code}
> The snapshot is the following:
> {quote}
> \# git branch
> \* branch-2.7
>   trunk
> \# git log
> commit 929b04ce3a4fe419dece49ed68d4f6228be214c1
> Author: Harsh J 
> Date:   Sun Mar 22 10:18:32 2015 +0530
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-11753) TestS3AContractOpen#testOpenReadZeroByteFile fails due to negative range header

2015-03-25 Thread Takenori Sato (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381343#comment-14381343
 ] 

Takenori Sato commented on HADOOP-11753:


Another failure of the same kind:

{code}
testSeekZeroByteFile(org.apache.hadoop.fs.contract.s3a.TestS3AContractSeek)  
Time elapsed: 9.478 sec  <<< ERROR!
com.amazonaws.services.s3.model.AmazonS3Exception: Status Code: 416, AWS 
Service: Amazon S3, AWS Request ID: 29E6B1A0D37011E4, AWS Error Code: 
InvalidRange, AWS Error Message: The requested range cannot be satisfied.
at 
com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:798)
at 
com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:421)
at 
com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:232)
at 
com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3528)
at 
com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:)
at 
org.apache.hadoop.fs.s3a.S3AInputStream.reopen(S3AInputStream.java:91)
at 
org.apache.hadoop.fs.s3a.S3AInputStream.openIfNeeded(S3AInputStream.java:62)
at org.apache.hadoop.fs.s3a.S3AInputStream.read(S3AInputStream.java:127)
at java.io.FilterInputStream.read(FilterInputStream.java:83)
at 
org.apache.hadoop.fs.contract.AbstractContractSeekTest.testSeekZeroByteFile(AbstractContractSeekTest.java:88)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at 
org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
{code}

> TestS3AContractOpen#testOpenReadZeroByteFile fails due to negative range 
> header
> ---
>
> Key: HADOOP-11753
> URL: https://issues.apache.org/jira/browse/HADOOP-11753
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Reporter: Takenori Sato
>
> _TestS3AContractOpen#testOpenReadZeroByteFile_ fails as follows.
> {code}
> testOpenReadZeroByteFile(org.apache.hadoop.fs.contract.s3a.TestS3AContractOpen)
>   Time elapsed: 3.312 sec  <<< ERROR!
> com.amazonaws.services.s3.model.AmazonS3Exception: Status Code: 416, AWS 
> Service: Amazon S3, AWS Request ID: A58A95E0D36811E4, AWS Error Code: 
> InvalidRange, AWS Error Message: The requested range cannot be satisfied.
>   at 
> com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:798)
>   at 
> com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:421)
>   at 
> com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:232)
>   at 
> com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3528)
>   at 
> com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:)
>   at 
> org.apache.hadoop.fs.s3a.S3AInputStream.reopen(S3AInputStream.java:91)
>   at 
> org.apache.hadoop.fs.s3a.S3AInputStream.openIfNeeded(S3AInputStream.java:62)
>   at org.apache.hadoop.fs.s3a.S3AInputStream.read(S3AInputStream.java:127)
>   at java.io.FilterInputStream.read(FilterInputStream.java:83)
>   at 
> org.apache.hadoop.fs.contract.AbstractContractOpenTest.testOpenReadZeroByteFile(AbstractContractOpenTest.java:66)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.

[jira] [Commented] (HADOOP-11753) TestS3AContractOpen#testOpenReadZeroByteFile fails due to negative range header

2015-03-25 Thread Takenori Sato (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381338#comment-14381338
 ] 

Takenori Sato commented on HADOOP-11753:


_TestS3AContractSeek#testBlockReadZeroByteFile_ fails for the same reason, too.

> TestS3AContractOpen#testOpenReadZeroByteFile fails due to negative range 
> header
> ---
>
> Key: HADOOP-11753
> URL: https://issues.apache.org/jira/browse/HADOOP-11753
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Reporter: Takenori Sato
>
> _TestS3AContractOpen#testOpenReadZeroByteFile_ fails as follows.
> {code}
> testOpenReadZeroByteFile(org.apache.hadoop.fs.contract.s3a.TestS3AContractOpen)
>   Time elapsed: 3.312 sec  <<< ERROR!
> com.amazonaws.services.s3.model.AmazonS3Exception: Status Code: 416, AWS 
> Service: Amazon S3, AWS Request ID: A58A95E0D36811E4, AWS Error Code: 
> InvalidRange, AWS Error Message: The requested range cannot be satisfied.
>   at 
> com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:798)
>   at 
> com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:421)
>   at 
> com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:232)
>   at 
> com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3528)
>   at 
> com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:)
>   at 
> org.apache.hadoop.fs.s3a.S3AInputStream.reopen(S3AInputStream.java:91)
>   at 
> org.apache.hadoop.fs.s3a.S3AInputStream.openIfNeeded(S3AInputStream.java:62)
>   at org.apache.hadoop.fs.s3a.S3AInputStream.read(S3AInputStream.java:127)
>   at java.io.FilterInputStream.read(FilterInputStream.java:83)
>   at 
> org.apache.hadoop.fs.contract.AbstractContractOpenTest.testOpenReadZeroByteFile(AbstractContractOpenTest.java:66)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {code}
> This is because the header is wrong when calling _S3AInputStream#read_ after 
> _S3AInputStream#open_.
> {code}
> Range: bytes=0--1
> * from 0 to -1
> {code}
> Tested on the latest branch-2.7.
> {quote}
> $ git log
> commit d286673c602524af08935ea132c8afd181b6e2e4
> Author: Jitendra Pandey 
> Date:   Tue Mar 24 16:17:06 2015 -0700
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HADOOP-11753) TestS3AContractOpen#testOpenReadZeroByteFile fails due to negative range header

2015-03-25 Thread Takenori Sato (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takenori Sato updated HADOOP-11753:
---
Summary: TestS3AContractOpen#testOpenReadZeroByteFile fails due to negative 
range header  (was: TestS3AContractOpen#testOpenReadZeroByteFile fals due to 
negative range header)

> TestS3AContractOpen#testOpenReadZeroByteFile fails due to negative range 
> header
> ---
>
> Key: HADOOP-11753
> URL: https://issues.apache.org/jira/browse/HADOOP-11753
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Reporter: Takenori Sato
>
> _TestS3AContractOpen#testOpenReadZeroByteFile_ fails as follows.
> {code}
> testOpenReadZeroByteFile(org.apache.hadoop.fs.contract.s3a.TestS3AContractOpen)
>   Time elapsed: 3.312 sec  <<< ERROR!
> com.amazonaws.services.s3.model.AmazonS3Exception: Status Code: 416, AWS 
> Service: Amazon S3, AWS Request ID: A58A95E0D36811E4, AWS Error Code: 
> InvalidRange, AWS Error Message: The requested range cannot be satisfied.
>   at 
> com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:798)
>   at 
> com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:421)
>   at 
> com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:232)
>   at 
> com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3528)
>   at 
> com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:)
>   at 
> org.apache.hadoop.fs.s3a.S3AInputStream.reopen(S3AInputStream.java:91)
>   at 
> org.apache.hadoop.fs.s3a.S3AInputStream.openIfNeeded(S3AInputStream.java:62)
>   at org.apache.hadoop.fs.s3a.S3AInputStream.read(S3AInputStream.java:127)
>   at java.io.FilterInputStream.read(FilterInputStream.java:83)
>   at 
> org.apache.hadoop.fs.contract.AbstractContractOpenTest.testOpenReadZeroByteFile(AbstractContractOpenTest.java:66)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {code}
> This is because the header is wrong when calling _S3AInputStream#read_ after 
> _S3AInputStream#open_.
> {code}
> Range: bytes=0--1
> * from 0 to -1
> {code}
> Tested on the latest branch-2.7.
> {quote}
> $ git log
> commit d286673c602524af08935ea132c8afd181b6e2e4
> Author: Jitendra Pandey 
> Date:   Tue Mar 24 16:17:06 2015 -0700
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HADOOP-11753) TestS3AContractOpen#testOpenReadZeroByteFile fals due to negative range header

2015-03-25 Thread Takenori Sato (JIRA)
Takenori Sato created HADOOP-11753:
--

 Summary: TestS3AContractOpen#testOpenReadZeroByteFile fals due to 
negative range header
 Key: HADOOP-11753
 URL: https://issues.apache.org/jira/browse/HADOOP-11753
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs/s3
Reporter: Takenori Sato


_TestS3AContractOpen#testOpenReadZeroByteFile_ fails as follows.

{code}
testOpenReadZeroByteFile(org.apache.hadoop.fs.contract.s3a.TestS3AContractOpen) 
 Time elapsed: 3.312 sec  <<< ERROR!
com.amazonaws.services.s3.model.AmazonS3Exception: Status Code: 416, AWS 
Service: Amazon S3, AWS Request ID: A58A95E0D36811E4, AWS Error Code: 
InvalidRange, AWS Error Message: The requested range cannot be satisfied.
at 
com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:798)
at 
com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:421)
at 
com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:232)
at 
com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3528)
at 
com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:)
at 
org.apache.hadoop.fs.s3a.S3AInputStream.reopen(S3AInputStream.java:91)
at 
org.apache.hadoop.fs.s3a.S3AInputStream.openIfNeeded(S3AInputStream.java:62)
at org.apache.hadoop.fs.s3a.S3AInputStream.read(S3AInputStream.java:127)
at java.io.FilterInputStream.read(FilterInputStream.java:83)
at 
org.apache.hadoop.fs.contract.AbstractContractOpenTest.testOpenReadZeroByteFile(AbstractContractOpenTest.java:66)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at 
org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
{code}

This is because the Range header is wrong when calling _S3AInputStream#read_ after 
_S3AInputStream#open_: for a zero-byte file the end offset evaluates to -1.

{code}
Range: bytes=0--1
* from 0 to -1
{code}

Tested on the latest branch-2.7.

{quote}
$ git log
commit d286673c602524af08935ea132c8afd181b6e2e4
Author: Jitendra Pandey 
Date:   Tue Mar 24 16:17:06 2015 -0700
{quote}
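
For illustration, here is a minimal sketch of one way to avoid the negative end 
offset. The field and method names (client, bucket, key, contentLength, 
wrappedStream) are assumptions for the example, not the committed fix.

{code}
// Sketch only: guard reopen() so a zero-byte object never produces
// "Range: bytes=0--1", which S3 rejects with 416 InvalidRange.
private synchronized void reopen(long pos) throws IOException {
  if (contentLength == 0 || pos >= contentLength) {
    // Nothing to read: present an empty stream so read() returns -1 (EOF)
    // instead of issuing a ranged GET with a negative end offset.
    wrappedStream = new java.io.ByteArrayInputStream(new byte[0]);
    this.pos = pos;
    return;
  }
  GetObjectRequest request = new GetObjectRequest(bucket, key)
      .withRange(pos, contentLength - 1);   // inclusive end offset, now >= 0
  wrappedStream = client.getObject(request).getObjectContent();
  this.pos = pos;
}
{code}

With such a guard, the zero-byte tests would hit EOF immediately instead of the 
416 response.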



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-11742) mkdir by file system shell fails on an empty bucket

2015-03-24 Thread Takenori Sato (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-11742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14379380#comment-14379380
 ] 

Takenori Sato commented on HADOOP-11742:


I checked how this is covered by test cases. 
_NativeS3FileSystemContractBaseTest#testListStatusForRoot_ looks like the 
relevant test for s3n, but I could not find anything similar for s3a.

_TestS3AContractRootDir_ is supposed to cover this scenario, correct?

> mkdir by file system shell fails on an empty bucket
> ---
>
> Key: HADOOP-11742
> URL: https://issues.apache.org/jira/browse/HADOOP-11742
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
> Environment: CentOS 7
>Reporter: Takenori Sato
> Attachments: HADOOP-11742-branch-2.7.001.patch
>
>
> I have built the latest 2.7, and tried S3AFileSystem.
> Then found that _mkdir_ fails on an empty bucket, named *s3a* here, as 
> follows:
> {code}
> # hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -mkdir s3a://s3a/foo
> 15/03/24 03:49:35 DEBUG s3a.S3AFileSystem: Getting path status for 
> s3a://s3a/foo (foo)
> 15/03/24 03:49:36 DEBUG s3a.S3AFileSystem: Not Found: s3a://s3a/foo
> 15/03/24 03:49:36 DEBUG s3a.S3AFileSystem: Getting path status for s3a://s3a/ 
> ()
> 15/03/24 03:49:36 DEBUG s3a.S3AFileSystem: Not Found: s3a://s3a/
> mkdir: `s3a://s3a/foo': No such file or directory
> {code}
> So does _ls_.
> {code}
> # hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -ls s3a://s3a/
> 15/03/24 03:47:48 DEBUG s3a.S3AFileSystem: Getting path status for s3a://s3a/ 
> ()
> 15/03/24 03:47:48 DEBUG s3a.S3AFileSystem: Not Found: s3a://s3a/
> ls: `s3a://s3a/': No such file or directory
> {code}
> This is how it works via s3n.
> {code}
> # hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -ls s3n://s3n/
> # hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -mkdir s3n://s3n/foo
> # hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -ls s3n://s3n/
> Found 1 items
> drwxrwxrwx   -  0 1970-01-01 00:00 s3n://s3n/foo
> {code}
> The snapshot is the following:
> {quote}
> \# git branch
> \* branch-2.7
>   trunk
> \# git log
> commit 929b04ce3a4fe419dece49ed68d4f6228be214c1
> Author: Harsh J 
> Date:   Sun Mar 22 10:18:32 2015 +0530
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HADOOP-11742) mkdir by file system shell fails on an empty bucket

2015-03-24 Thread Takenori Sato (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takenori Sato updated HADOOP-11742:
---
Attachment: HADOOP-11742-branch-2.7.001.patch

With this patch, an empty key is treated as the root directory instead of being 
reported as "Not Found". This matches the behavior of 
_NativeS3FileSystem#getFileStatus_. The debug log below shows _mkdir_ and _ls_ 
succeeding on an empty bucket with the patch applied.

{code}
# hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -ls s3a://s3a/
15/03/25 06:28:05 DEBUG s3a.S3AFileSystem: Getting path status for s3a://s3a/ ()
15/03/25 06:28:05 DEBUG s3a.S3AFileSystem: List status for path: s3a://s3a/
15/03/25 06:28:05 DEBUG s3a.S3AFileSystem: Getting path status for s3a://s3a/ ()
15/03/25 06:28:05 DEBUG s3a.S3AFileSystem: listStatus: doing listObjects for 
directory 
# hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -mkdir s3a://s3a/foo
15/03/25 06:28:22 DEBUG s3a.S3AFileSystem: Getting path status for 
s3a://s3a/foo (foo)
15/03/25 06:28:23 DEBUG s3a.S3AFileSystem: Not Found: s3a://s3a/foo
15/03/25 06:28:23 DEBUG s3a.S3AFileSystem: Getting path status for s3a://s3a/ ()
15/03/25 06:28:23 DEBUG s3a.S3AFileSystem: Making directory: s3a://s3a/foo
15/03/25 06:28:23 DEBUG s3a.S3AFileSystem: Getting path status for 
s3a://s3a/foo (foo)
15/03/25 06:28:23 DEBUG s3a.S3AFileSystem: Not Found: s3a://s3a/foo
15/03/25 06:28:23 DEBUG s3a.S3AFileSystem: Getting path status for 
s3a://s3a/foo (foo)
15/03/25 06:28:24 DEBUG s3a.S3AFileSystem: Not Found: s3a://s3a/foo
15/03/25 06:28:24 DEBUG s3a.S3AFileSystem: Getting path status for s3a://s3a/ ()
# hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -ls s3a://s3a/
15/03/25 06:28:31 DEBUG s3a.S3AFileSystem: Getting path status for s3a://s3a/ ()
15/03/25 06:28:31 DEBUG s3a.S3AFileSystem: List status for path: s3a://s3a/
15/03/25 06:28:31 DEBUG s3a.S3AFileSystem: Getting path status for s3a://s3a/ ()
15/03/25 06:28:31 DEBUG s3a.S3AFileSystem: listStatus: doing listObjects for 
directory 
15/03/25 06:28:31 DEBUG s3a.S3AFileSystem: Adding: rd: s3a://s3a/foo
Found 1 items
drwxrwxrwx   -  0 1970-01-01 00:00 s3a://s3a/foo
{code}
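
For illustration, a minimal sketch of the behavior described above. The helper 
and constructor names (pathToKey, S3AFileStatus, uri, workingDir) are 
assumptions for the example, not a quote from the attached patch.

{code}
// Sketch: treat the empty key (the bucket root, e.g. s3a://s3a/) as an
// existing directory instead of falling through to "Not Found".
public FileStatus getFileStatus(Path f) throws IOException {
  String key = pathToKey(f);               // "" for the bucket root
  if (key.isEmpty()) {
    // An empty bucket has no objects and no fake directory marker,
    // but its root must still exist as a (possibly empty) directory.
    return new S3AFileStatus(true, true, f.makeQualified(uri, workingDir));
  }
  // ... existing probes: HEAD on the key, HEAD on key + "/", then listObjects ...
  throw new FileNotFoundException("No such file or directory: " + f);
}
{code}

A root FileStatus like this is what lets _listStatus_ and _mkdirs_ proceed on an 
empty bucket.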


> mkdir by file system shell fails on an empty bucket
> ---
>
> Key: HADOOP-11742
> URL: https://issues.apache.org/jira/browse/HADOOP-11742
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
> Environment: CentOS 7
>Reporter: Takenori Sato
> Attachments: HADOOP-11742-branch-2.7.001.patch
>
>
> I have built the latest 2.7, and tried S3AFileSystem.
> Then found that _mkdir_ fails on an empty bucket, named *s3a* here, as 
> follows:
> {code}
> # hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -mkdir s3a://s3a/foo
> 15/03/24 03:49:35 DEBUG s3a.S3AFileSystem: Getting path status for 
> s3a://s3a/foo (foo)
> 15/03/24 03:49:36 DEBUG s3a.S3AFileSystem: Not Found: s3a://s3a/foo
> 15/03/24 03:49:36 DEBUG s3a.S3AFileSystem: Getting path status for s3a://s3a/ 
> ()
> 15/03/24 03:49:36 DEBUG s3a.S3AFileSystem: Not Found: s3a://s3a/
> mkdir: `s3a://s3a/foo': No such file or directory
> {code}
> So does _ls_.
> {code}
> # hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -ls s3a://s3a/
> 15/03/24 03:47:48 DEBUG s3a.S3AFileSystem: Getting path status for s3a://s3a/ 
> ()
> 15/03/24 03:47:48 DEBUG s3a.S3AFileSystem: Not Found: s3a://s3a/
> ls: `s3a://s3a/': No such file or directory
> {code}
> This is how it works via s3n.
> {code}
> # hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -ls s3n://s3n/
> # hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -mkdir s3n://s3n/foo
> # hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -ls s3n://s3n/
> Found 1 items
> drwxrwxrwx   -  0 1970-01-01 00:00 s3n://s3n/foo
> {code}
> The snapshot is the following:
> {quote}
> \# git branch
> \* branch-2.7
>   trunk
> \# git log
> commit 929b04ce3a4fe419dece49ed68d4f6228be214c1
> Author: Harsh J 
> Date:   Sun Mar 22 10:18:32 2015 +0530
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HADOOP-11742) mkdir by file system shell fails on an empty bucket

2015-03-23 Thread Takenori Sato (JIRA)
Takenori Sato created HADOOP-11742:
--

 Summary: mkdir by file system shell fails on an empty bucket
 Key: HADOOP-11742
 URL: https://issues.apache.org/jira/browse/HADOOP-11742
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs/s3
 Environment: CentOS 7
Reporter: Takenori Sato


I have built the latest 2.7, and tried S3AFileSystem.

I then found that _mkdir_ fails on an empty bucket, named *s3a* here, as follows:

{code}
# hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -mkdir s3a://s3a/foo
15/03/24 03:49:35 DEBUG s3a.S3AFileSystem: Getting path status for 
s3a://s3a/foo (foo)
15/03/24 03:49:36 DEBUG s3a.S3AFileSystem: Not Found: s3a://s3a/foo
15/03/24 03:49:36 DEBUG s3a.S3AFileSystem: Getting path status for s3a://s3a/ ()
15/03/24 03:49:36 DEBUG s3a.S3AFileSystem: Not Found: s3a://s3a/
mkdir: `s3a://s3a/foo': No such file or directory
{code}

So does _ls_.

{code}
# hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -ls s3a://s3a/
15/03/24 03:47:48 DEBUG s3a.S3AFileSystem: Getting path status for s3a://s3a/ ()
15/03/24 03:47:48 DEBUG s3a.S3AFileSystem: Not Found: s3a://s3a/
ls: `s3a://s3a/': No such file or directory
{code}

This is how it works via s3n.

{code}
# hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -ls s3n://s3n/
# hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -mkdir s3n://s3n/foo
# hadoop-2.7.0-SNAPSHOT/bin/hdfs dfs -ls s3n://s3n/
Found 1 items
drwxrwxrwx   -  0 1970-01-01 00:00 s3n://s3n/foo
{code}

The snapshot is the following:

{quote}
\# git branch
\* branch-2.7
  trunk
\# git log
commit 929b04ce3a4fe419dece49ed68d4f6228be214c1
Author: Harsh J 
Date:   Sun Mar 22 10:18:32 2015 +0530
{quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HADOOP-11730) The broken s3n read retry logic causes a wrong output being committed

2015-03-19 Thread Takenori Sato (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takenori Sato updated HADOOP-11730:
---
Attachment: (was: HADOOP-11730-branch-2.6.0.001.patch)

> The broken s3n read retry logic causes a wrong output being committed
> -
>
> Key: HADOOP-11730
> URL: https://issues.apache.org/jira/browse/HADOOP-11730
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 2.6.0
> Environment: HDP 2.2
>Reporter: Takenori Sato
>Assignee: Takenori Sato
> Attachments: HADOOP-11730-branch-2.6.0.001.patch
>
>
> s3n attempts to read again when it encounters an IOException during a read. But 
> the current logic does not reopen the connection, so the retry ends up as a 
> no-op and the wrong (truncated) output is committed.
> Here's a stack trace as an example.
> {quote}
> 2015-03-13 20:17:24,835 [TezChild] INFO  
> org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor - 
> Starting output org.apache.tez.mapreduce.output.MROutput@52008dbd to vertex 
> scope-12
> 2015-03-13 20:17:24,866 [TezChild] DEBUG 
> org.jets3t.service.impl.rest.httpclient.HttpMethodReleaseInputStream - 
> Released HttpMethod as its response data stream threw an exception
> org.apache.http.ConnectionClosedException: Premature end of Content-Length 
> delimited message body (expected: 296587138; received: 155648
>   at 
> org.apache.http.impl.io.ContentLengthInputStream.read(ContentLengthInputStream.java:184)
>   at 
> org.apache.http.conn.EofSensorInputStream.read(EofSensorInputStream.java:138)
>   at 
> org.jets3t.service.io.InterruptableInputStream.read(InterruptableInputStream.java:78)
>   at 
> org.jets3t.service.impl.rest.httpclient.HttpMethodReleaseInputStream.read(HttpMethodReleaseInputStream.java:146)
>   at 
> org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.read(NativeS3FileSystem.java:145)
>   at java.io.BufferedInputStream.read1(BufferedInputStream.java:273)
>   at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
>   at java.io.DataInputStream.read(DataInputStream.java:100)
>   at org.apache.hadoop.util.LineReader.fillBuffer(LineReader.java:180)
>   at 
> org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:216)
>   at org.apache.hadoop.util.LineReader.readLine(LineReader.java:174)
>   at 
> org.apache.hadoop.mapreduce.lib.input.LineRecordReader.nextKeyValue(LineRecordReader.java:185)
>   at org.apache.pig.builtin.PigStorage.getNext(PigStorage.java:259)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:204)
>   at 
> org.apache.tez.mapreduce.lib.MRReaderMapReduce.next(MRReaderMapReduce.java:116)
>   at 
> org.apache.pig.backend.hadoop.executionengine.tez.plan.operator.POSimpleTezLoad.getNextTuple(POSimpleTezLoad.java:106)
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:307)
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNextTuple(POForEach.java:246)
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:307)
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POFilter.getNextTuple(POFilter.java:91)
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:307)
>   at 
> org.apache.pig.backend.hadoop.executionengine.tez.plan.operator.POStoreTez.getNextTuple(POStoreTez.java:117)
>   at 
> org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor.runPipeline(PigProcessor.java:313)
>   at 
> org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor.run(PigProcessor.java:192)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
>   at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
>   at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>   at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
>   at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> 

[jira] [Updated] (HADOOP-11730) The broken s3n read retry logic causes a wrong output being committed

2015-03-19 Thread Takenori Sato (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takenori Sato updated HADOOP-11730:
---
Attachment: HADOOP-11730-branch-2.6.0.001.patch

The first patch with the updated test case.

> The broken s3n read retry logic causes a wrong output being committed
> -
>
> Key: HADOOP-11730
> URL: https://issues.apache.org/jira/browse/HADOOP-11730
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 2.6.0
> Environment: HDP 2.2
>Reporter: Takenori Sato
>Assignee: Takenori Sato
> Attachments: HADOOP-11730-branch-2.6.0.001.patch
>
>
> s3n attempts to read again when it encounters an IOException during a read. But 
> the current logic does not reopen the connection, so the retry ends up as a 
> no-op and the wrong (truncated) output is committed.
> Here's a stack trace as an example.
> {quote}
> 2015-03-13 20:17:24,835 [TezChild] INFO  
> org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor - 
> Starting output org.apache.tez.mapreduce.output.MROutput@52008dbd to vertex 
> scope-12
> 2015-03-13 20:17:24,866 [TezChild] DEBUG 
> org.jets3t.service.impl.rest.httpclient.HttpMethodReleaseInputStream - 
> Released HttpMethod as its response data stream threw an exception
> org.apache.http.ConnectionClosedException: Premature end of Content-Length 
> delimited message body (expected: 296587138; received: 155648
>   at 
> org.apache.http.impl.io.ContentLengthInputStream.read(ContentLengthInputStream.java:184)
>   at 
> org.apache.http.conn.EofSensorInputStream.read(EofSensorInputStream.java:138)
>   at 
> org.jets3t.service.io.InterruptableInputStream.read(InterruptableInputStream.java:78)
>   at 
> org.jets3t.service.impl.rest.httpclient.HttpMethodReleaseInputStream.read(HttpMethodReleaseInputStream.java:146)
>   at 
> org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.read(NativeS3FileSystem.java:145)
>   at java.io.BufferedInputStream.read1(BufferedInputStream.java:273)
>   at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
>   at java.io.DataInputStream.read(DataInputStream.java:100)
>   at org.apache.hadoop.util.LineReader.fillBuffer(LineReader.java:180)
>   at 
> org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:216)
>   at org.apache.hadoop.util.LineReader.readLine(LineReader.java:174)
>   at 
> org.apache.hadoop.mapreduce.lib.input.LineRecordReader.nextKeyValue(LineRecordReader.java:185)
>   at org.apache.pig.builtin.PigStorage.getNext(PigStorage.java:259)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:204)
>   at 
> org.apache.tez.mapreduce.lib.MRReaderMapReduce.next(MRReaderMapReduce.java:116)
>   at 
> org.apache.pig.backend.hadoop.executionengine.tez.plan.operator.POSimpleTezLoad.getNextTuple(POSimpleTezLoad.java:106)
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:307)
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNextTuple(POForEach.java:246)
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:307)
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POFilter.getNextTuple(POFilter.java:91)
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:307)
>   at 
> org.apache.pig.backend.hadoop.executionengine.tez.plan.operator.POStoreTez.getNextTuple(POStoreTez.java:117)
>   at 
> org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor.runPipeline(PigProcessor.java:313)
>   at 
> org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor.run(PigProcessor.java:192)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
>   at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
>   at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>   at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
>   at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
>   at java.util.concurrent.FutureTask.run(F

[jira] [Updated] (HADOOP-11730) The broken s3n read retry logic causes a wrong output being committed

2015-03-19 Thread Takenori Sato (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takenori Sato updated HADOOP-11730:
---
Attachment: HADOOP-11730-branch-2.6.0.001.patch

The first proposal without the test case.

{code}
2015-03-20 12:05:08,473 [TezChild] INFO  org.apache.hadoop.fs.s3native.NativeS3FileSystem - Received IOException while reading 'user/hadoop/tsato/readlarge/input/cloudian-s3.log.20141119', attempting to reopen.
2015-03-20 12:05:08,473 [TezChild] DEBUG org.jets3t.service.impl.rest.httpclient.RestStorageService - Retrieving All information for bucket shared and object user/hadoop/tsato/readlarge/input/cloudian-s3.log.20141119
{code}

Verified manually that it reopens a new connection after IOException.



> The broken s3n read retry logic causes a wrong output being committed
> -
>
> Key: HADOOP-11730
> URL: https://issues.apache.org/jira/browse/HADOOP-11730
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 2.6.0
> Environment: HDP 2.2
>Reporter: Takenori Sato
>Assignee: Takenori Sato
> Attachments: HADOOP-11730-branch-2.6.0.001.patch
>
>
> s3n attempts to read again when it encounters an IOException during a read. But 
> the current logic does not reopen the connection, so the retry ends up as a 
> no-op and the wrong (truncated) output is committed.
> Here's a stack trace as an example.
> {quote}
> 2015-03-13 20:17:24,835 [TezChild] INFO  
> org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor - 
> Starting output org.apache.tez.mapreduce.output.MROutput@52008dbd to vertex 
> scope-12
> 2015-03-13 20:17:24,866 [TezChild] DEBUG 
> org.jets3t.service.impl.rest.httpclient.HttpMethodReleaseInputStream - 
> Released HttpMethod as its response data stream threw an exception
> org.apache.http.ConnectionClosedException: Premature end of Content-Length 
> delimited message body (expected: 296587138; received: 155648
>   at 
> org.apache.http.impl.io.ContentLengthInputStream.read(ContentLengthInputStream.java:184)
>   at 
> org.apache.http.conn.EofSensorInputStream.read(EofSensorInputStream.java:138)
>   at 
> org.jets3t.service.io.InterruptableInputStream.read(InterruptableInputStream.java:78)
>   at 
> org.jets3t.service.impl.rest.httpclient.HttpMethodReleaseInputStream.read(HttpMethodReleaseInputStream.java:146)
>   at 
> org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.read(NativeS3FileSystem.java:145)
>   at java.io.BufferedInputStream.read1(BufferedInputStream.java:273)
>   at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
>   at java.io.DataInputStream.read(DataInputStream.java:100)
>   at org.apache.hadoop.util.LineReader.fillBuffer(LineReader.java:180)
>   at 
> org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:216)
>   at org.apache.hadoop.util.LineReader.readLine(LineReader.java:174)
>   at 
> org.apache.hadoop.mapreduce.lib.input.LineRecordReader.nextKeyValue(LineRecordReader.java:185)
>   at org.apache.pig.builtin.PigStorage.getNext(PigStorage.java:259)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:204)
>   at 
> org.apache.tez.mapreduce.lib.MRReaderMapReduce.next(MRReaderMapReduce.java:116)
>   at 
> org.apache.pig.backend.hadoop.executionengine.tez.plan.operator.POSimpleTezLoad.getNextTuple(POSimpleTezLoad.java:106)
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:307)
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNextTuple(POForEach.java:246)
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:307)
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POFilter.getNextTuple(POFilter.java:91)
>   at 
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:307)
>   at 
> org.apache.pig.backend.hadoop.executionengine.tez.plan.operator.POStoreTez.getNextTuple(POStoreTez.java:117)
>   at 
> org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor.runPipeline(PigProcessor.java:313)
>   at 
> org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor.run(PigProcessor.java:192)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
>   at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
>   at 
> org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTa

[jira] [Created] (HADOOP-11730) The broken s3n read retry logic causes a wrong output being committed

2015-03-19 Thread Takenori Sato (JIRA)
Takenori Sato created HADOOP-11730:
--

 Summary: The broken s3n read retry logic causes a wrong output 
being committed
 Key: HADOOP-11730
 URL: https://issues.apache.org/jira/browse/HADOOP-11730
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs/s3
Affects Versions: 2.6.0
 Environment: HDP 2.2
Reporter: Takenori Sato
Assignee: Takenori Sato


s3n attempts to read again when it encounters an IOException during a read. But the 
current logic does not reopen the connection, so the retry ends up as a no-op and 
the wrong (truncated) output is committed.
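
For illustration, a minimal sketch of the intended retry behavior, assuming the 
stream keeps the usual in/pos/key fields and that seek(pos) closes the old 
connection and issues a fresh GET at that position (these names are assumptions, 
not the attached patch):

{code}
// Sketch only: reopen the underlying jets3t stream before retrying, so the
// retry actually reads from a live connection instead of being a no-op.
@Override
public synchronized int read(byte[] b, int off, int len) throws IOException {
  int result;
  try {
    result = in.read(b, off, len);
  } catch (IOException e) {
    LOG.info("Received IOException while reading '" + key
        + "', attempting to reopen.");
    seek(pos);                       // assumed to re-open the object at 'pos'
    result = in.read(b, off, len);   // retry on the fresh stream
  }
  if (result > 0) {
    pos += result;
  }
  return result;
}
{code}

When the reopen works, the jets3t debug log shows a fresh object request 
("Retrieving All information for bucket ...") right after the "attempting to 
reopen" message.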

Here's a stack trace as an example.

{quote}
2015-03-13 20:17:24,835 [TezChild] INFO  
org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor - 
Starting output org.apache.tez.mapreduce.output.MROutput@52008dbd to vertex 
scope-12
2015-03-13 20:17:24,866 [TezChild] DEBUG 
org.jets3t.service.impl.rest.httpclient.HttpMethodReleaseInputStream - Released 
HttpMethod as its response data stream threw an exception
org.apache.http.ConnectionClosedException: Premature end of Content-Length 
delimited message body (expected: 296587138; received: 155648
at 
org.apache.http.impl.io.ContentLengthInputStream.read(ContentLengthInputStream.java:184)
at 
org.apache.http.conn.EofSensorInputStream.read(EofSensorInputStream.java:138)
at 
org.jets3t.service.io.InterruptableInputStream.read(InterruptableInputStream.java:78)
at 
org.jets3t.service.impl.rest.httpclient.HttpMethodReleaseInputStream.read(HttpMethodReleaseInputStream.java:146)
at 
org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.read(NativeS3FileSystem.java:145)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:273)
at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
at java.io.DataInputStream.read(DataInputStream.java:100)
at org.apache.hadoop.util.LineReader.fillBuffer(LineReader.java:180)
at 
org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:216)
at org.apache.hadoop.util.LineReader.readLine(LineReader.java:174)
at 
org.apache.hadoop.mapreduce.lib.input.LineRecordReader.nextKeyValue(LineRecordReader.java:185)
at org.apache.pig.builtin.PigStorage.getNext(PigStorage.java:259)
at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:204)
at 
org.apache.tez.mapreduce.lib.MRReaderMapReduce.next(MRReaderMapReduce.java:116)
at 
org.apache.pig.backend.hadoop.executionengine.tez.plan.operator.POSimpleTezLoad.getNextTuple(POSimpleTezLoad.java:106)
at 
org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:307)
at 
org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNextTuple(POForEach.java:246)
at 
org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:307)
at 
org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POFilter.getNextTuple(POFilter.java:91)
at 
org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:307)
at 
org.apache.pig.backend.hadoop.executionengine.tez.plan.operator.POStoreTez.getNextTuple(POStoreTez.java:117)
at 
org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor.runPipeline(PigProcessor.java:313)
at 
org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor.run(PigProcessor.java:192)
at 
org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
at 
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
at 
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at 
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
at 
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
2015-03-13 20:17:24,867 [TezChild] INFO  
org.apache.hadoop.fs.s3native.NativeS3FileSystem - Received IOException while 
reading 'user/hadoop/tsato/readlarge/input/clou

[jira] [Resolved] (HADOOP-10037) s3n read truncated, but doesn't throw exception

2015-03-19 Thread Takenori Sato (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takenori Sato resolved HADOOP-10037.

Resolution: Fixed

The issue that reopened this turned out to be a separate one.

> s3n read truncated, but doesn't throw exception 
> 
>
> Key: HADOOP-10037
> URL: https://issues.apache.org/jira/browse/HADOOP-10037
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 2.0.0-alpha
> Environment: Ubuntu Linux 13.04 running on Amazon EC2 (cc2.8xlarge)
>Reporter: David Rosenstrauch
> Fix For: 2.6.0
>
> Attachments: S3ReadFailedOnTruncation.html, S3ReadSucceeded.html
>
>
> For months now we've been finding that we've been experiencing frequent data 
> truncation issues when reading from S3 using the s3n:// protocol.  I finally 
> was able to gather some debugging output on the issue in a job I ran last 
> night, and so can finally file a bug report.
> The job I ran last night was on a 16-node cluster (all of them AWS EC2 
> cc2.8xlarge machines, running Ubuntu 13.04 and Cloudera CDH4.3.0).  The job 
> was a Hadoop streaming job, which reads through a large number (i.e., 
> ~55,000) of files on S3, each of them approximately 300K bytes in size.
> All of the files contain 46 columns of data in each record.  But I added in 
> an extra check in my mapper code to count and verify the number of columns in 
> every record - throwing an error and crashing the map task if the column 
> count is wrong.
> If you look in the attached task logs, you'll see 2 attempts on the same 
> task.  The first one fails due to data truncated (i.e., my job intentionally 
> fails the map task due to the current record failing the column count check). 
>  The task then gets retried on a different machine and runs to a successful 
> completion.
> You can see further evidence of the truncation further down in the task logs, 
> where it displays the count of the records read:  the failed task says 32953 
> records read, while the successful task says 63133.
> Any idea what the problem might be here and/or how to work around it?  This 
> issue is a very common occurrence on our clusters.  E.g., in the job I ran 
> last night before I had gone to bed I had already encountered 8 such 
> failures, and the job was only 10% complete.  (~25,000 out of ~250,000 tasks.)
> I realize that it's common for I/O errors to occur - possibly even frequently 
> - in a large Hadoop job.  But I would think that if an I/O failure (like a 
> truncated read) did occur, that something in the underlying infrastructure 
> code (i.e., either in NativeS3FileSystem or in jets3t) should detect the 
> error and throw an IOException accordingly.  It shouldn't be up to the 
> calling code to detect such failures, IMO.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HADOOP-10037) s3n read truncated, but doesn't throw exception

2015-03-19 Thread Takenori Sato (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14370292#comment-14370292
 ] 

Takenori Sato commented on HADOOP-10037:


David, thanks for your clarification.

I heard from Steve that my issue was introduced by some optimizations done for 
2.4.

So let me close this as FIXED. I will create a new issue for mine.

> s3n read truncated, but doesn't throw exception 
> 
>
> Key: HADOOP-10037
> URL: https://issues.apache.org/jira/browse/HADOOP-10037
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 2.0.0-alpha
> Environment: Ubuntu Linux 13.04 running on Amazon EC2 (cc2.8xlarge)
>Reporter: David Rosenstrauch
> Fix For: 2.6.0
>
> Attachments: S3ReadFailedOnTruncation.html, S3ReadSucceeded.html
>
>
> For months now we've been finding that we've been experiencing frequent data 
> truncation issues when reading from S3 using the s3n:// protocol.  I finally 
> was able to gather some debugging output on the issue in a job I ran last 
> night, and so can finally file a bug report.
> The job I ran last night was on a 16-node cluster (all of them AWS EC2 
> cc2.8xlarge machines, running Ubuntu 13.04 and Cloudera CDH4.3.0).  The job 
> was a Hadoop streaming job, which reads through a large number (i.e., 
> ~55,000) of files on S3, each of them approximately 300K bytes in size.
> All of the files contain 46 columns of data in each record.  But I added in 
> an extra check in my mapper code to count and verify the number of columns in 
> every record - throwing an error and crashing the map task if the column 
> count is wrong.
> If you look in the attached task logs, you'll see 2 attempts on the same 
> task.  The first one fails due to data truncated (i.e., my job intentionally 
> fails the map task due to the current record failing the column count check). 
>  The task then gets retried on a different machine and runs to a successful 
> completion.
> You can see further evidence of the truncation further down in the task logs, 
> where it displays the count of the records read:  the failed task says 32953 
> records read, while the successful task says 63133.
> Any idea what the problem might be here and/or how to work around it?  This 
> issue is a very common occurrence on our clusters.  E.g., in the job I ran 
> last night before I had gone to bed I had already encountered 8 such 
> failures, and the job was only 10% complete.  (~25,000 out of ~250,000 tasks.)
> I realize that it's common for I/O errors to occur - possibly even frequently 
> - in a large Hadoop job.  But I would think that if an I/O failure (like a 
> truncated read) did occur, that something in the underlying infrastructure 
> code (i.e., either in NativeS3FileSystem or in jets3t) should detect the 
> error and throw an IOException accordingly.  It shouldn't be up to the 
> calling code to detect such failures, IMO.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Reopened] (HADOOP-10037) s3n read truncated, but doesn't throw exception

2015-03-18 Thread Takenori Sato (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-10037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takenori Sato reopened HADOOP-10037:


I confirmed this happens on Hadoop 2.6.0, and found the reason.

Here's the stacktrace.

{quote}

2015-03-13 20:17:24,866 [TezChild] DEBUG 
org.jets3t.service.impl.rest.httpclient.HttpMethodReleaseInputStream - Released 
HttpMethod as its response data stream threw an exception
org.apache.http.ConnectionClosedException: Premature end of Content-Length 
delimited message body (expected: 296587138; received: 155648
at 
org.apache.http.impl.io.ContentLengthInputStream.read(ContentLengthInputStream.java:184)
at 
org.apache.http.conn.EofSensorInputStream.read(EofSensorInputStream.java:138)
at 
org.jets3t.service.io.InterruptableInputStream.read(InterruptableInputStream.java:78)
at 
org.jets3t.service.impl.rest.httpclient.HttpMethodReleaseInputStream.read(HttpMethodReleaseInputStream.java:146)
at 
org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.read(NativeS3FileSystem.java:145)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:273)
at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
at java.io.DataInputStream.read(DataInputStream.java:100)
at org.apache.hadoop.util.LineReader.fillBuffer(LineReader.java:180)
at 
org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:216)
at org.apache.hadoop.util.LineReader.readLine(LineReader.java:174)
at 
org.apache.hadoop.mapreduce.lib.input.LineRecordReader.nextKeyValue(LineRecordReader.java:185)
at org.apache.pig.builtin.PigStorage.getNext(PigStorage.java:259)
at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:204)
at 
org.apache.tez.mapreduce.lib.MRReaderMapReduce.next(MRReaderMapReduce.java:116)
at 
org.apache.pig.backend.hadoop.executionengine.tez.plan.operator.POSimpleTezLoad.getNextTuple(POSimpleTezLoad.java:106)
at 
org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:307)
at 
org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNextTuple(POForEach.java:246)
at 
org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:307)
at 
org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POFilter.getNextTuple(POFilter.java:91)
at 
org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:307)
at 
org.apache.pig.backend.hadoop.executionengine.tez.plan.operator.POStoreTez.getNextTuple(POStoreTez.java:117)
at 
org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor.runPipeline(PigProcessor.java:313)
at 
org.apache.pig.backend.hadoop.executionengine.tez.runtime.PigProcessor.run(PigProcessor.java:192)
at 
org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
at 
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
at 
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at 
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
at 
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
2015-03-13 20:17:24,867 [TezChild] INFO  
org.apache.hadoop.fs.s3native.NativeS3FileSystem - Received IOException while 
reading 'user/hadoop/tsato/readlarge/input/cloudian-s3.log.20141119', 
attempting to reopen.
2015-03-13 20:17:24,867 [TezChild] DEBUG 
org.jets3t.service.impl.rest.httpclient.HttpMethodReleaseInputStream - Released 
HttpMethod as its response data stream is fully consumed
2015-03-13 20:17:24,868 [TezChild] INFO  
org.apache.tez.dag.app.TaskAttemptListenerImpTezDag - Commit go/no-go request 
from attempt_1426245338920_0001_1_00_04_0
2015-03-13 20:17:24,868 [TezChild] INFO  
org.apache.tez.dag.app.dag.impl.TaskImpl - 
attempt_1426245338920_0001_1_00_04_0 given a go for committing the task 
output.

{quote}

The problem is that a job successfully finishes after the exception. T

[jira] [Commented] (HADOOP-10400) Incorporate new S3A FileSystem implementation

2014-08-20 Thread Takenori Sato (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-10400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14104967#comment-14104967
 ] 

Takenori Sato commented on HADOOP-10400:


Hi Jordan Mendelson,

I came from HADOOP-10643, where you suggested that a new improvement over 
NativeS3FileSystem should be done here.

So I've made 2 pull requests for your upstream repository.

1. make endpoint configurable
https://github.com/Aloisius/hadoop-s3a/pull/8

jets3t allows a user to configure an endpoint (protocol, host, and port) through 
jets3t.properties. With the AWS SDK, however, the endpoint cannot be configured 
without calling a particular method on the client. This fix simply makes it 
configurable (see the sketch after item 2).

2. subclass of AbstractFileSystem
https://github.com/Aloisius/hadoop-s3a/pull/9

This contains a fix for a problem similar to HADOOP-10643. The difference is 
that this fix is simpler and requires no modification to AbstractFileSystem.
Also, HADOOP-8984 becomes obvious when using this subclass, so its fix is 
included as well.
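
For illustration, a sketch of what pull request 1 enables. The property name 
fs.s3a.endpoint is a placeholder of mine here, not necessarily the key used in 
the pull request.

{code}
import org.apache.hadoop.conf.Configuration;

import com.amazonaws.services.s3.AmazonS3Client;

// Sketch: apply a user-configured endpoint to the AWS SDK client, which
// otherwise only honours an endpoint set programmatically via setEndpoint().
public final class EndpointConfigExample {
  public static void applyEndpoint(AmazonS3Client s3, Configuration conf) {
    String endpoint = conf.getTrimmed("fs.s3a.endpoint", "");
    if (!endpoint.isEmpty()) {
      s3.setEndpoint(endpoint);   // e.g. an S3-compatible store other than AWS
    }
  }
}
{code}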


Btw, in my test with Pig, I needed to apply the following fix to make this work:
"Ensure the file is open before trying to seek"
https://github.com/Aloisius/hadoop-s3a/pull/6

> Incorporate new S3A FileSystem implementation
> -
>
> Key: HADOOP-10400
> URL: https://issues.apache.org/jira/browse/HADOOP-10400
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs, fs/s3
>Affects Versions: 2.4.0
>Reporter: Jordan Mendelson
>Assignee: Jordan Mendelson
> Attachments: HADOOP-10400-1.patch, HADOOP-10400-2.patch, 
> HADOOP-10400-3.patch, HADOOP-10400-4.patch, HADOOP-10400-5.patch, 
> HADOOP-10400-6.patch
>
>
> The s3native filesystem has a number of limitations (some of which were 
> recently fixed by HADOOP-9454). This patch adds an s3a filesystem which uses 
> the aws-sdk instead of the jets3t library. There are a number of improvements 
> over s3native including:
> - Parallel copy (rename) support (dramatically speeds up commits on large 
> files)
> - AWS S3 explorer compatible empty directories files "xyz/" instead of 
> "xyz_$folder$" (reduces littering)
> - Ignores s3native created _$folder$ files created by s3native and other S3 
> browsing utilities
> - Supports multiple output buffer dirs to even out IO when uploading files
> - Supports IAM role-based authentication
> - Allows setting a default canned ACL for uploads (public, private, etc.)
> - Better error recovery handling
> - Should handle input seeks without having to download the whole file (used 
> for splits a lot)
> This code is a copy of https://github.com/Aloisius/hadoop-s3a with patches to 
> various pom files to get it to build against trunk. I've been using 0.0.1 in 
> production with CDH 4 for several months and CDH 5 for a few days. The 
> version here is 0.0.2 which changes around some keys to hopefully bring the 
> key name style more inline with the rest of hadoop 2.x.
> *Tunable parameters:*
> fs.s3a.access.key - Your AWS access key ID (omit for role authentication)
> fs.s3a.secret.key - Your AWS secret key (omit for role authentication)
> fs.s3a.connection.maximum - Controls how many parallel connections 
> HttpClient spawns (default: 15)
> fs.s3a.connection.ssl.enabled - Enables or disables SSL connections to S3 
> (default: true)
> fs.s3a.attempts.maximum - How many times we should retry commands on 
> transient errors (default: 10)
> fs.s3a.connection.timeout - Socket connect timeout (default: 5000)
> fs.s3a.paging.maximum - How many keys to request from S3 when doing 
> directory listings at a time (default: 5000)
> fs.s3a.multipart.size - How big (in bytes) to split a upload or copy 
> operation up into (default: 104857600)
> fs.s3a.multipart.threshold - Until a file is this large (in bytes), use 
> non-parallel upload (default: 2147483647)
> fs.s3a.acl.default - Set a canned ACL on newly created/copied objects 
> (private | public-read | public-read-write | authenticated-read | 
> log-delivery-write | bucket-owner-read | bucket-owner-full-control)
> fs.s3a.multipart.purge - True if you want to purge existing multipart 
> uploads that may not have been completed/aborted correctly (default: false)
> fs.s3a.multipart.purge.age - Minimum age in seconds of multipart uploads 
> to purge (default: 86400)
> fs.s3a.buffer.dir - Comma separated list of directories that will be used 
> to buffer file writes out of (default: uses ${hadoop.tmp.dir}/s3a )
> *Caveats*:
> Hadoop uses a standard output committer which uploads files as 
> filename.COPYING before renaming them. This can cause unnecessary performance 
> issues with S3 because it does not have a rename operation and S3 already 
> verifies uploads against an md5 that the driver sets on the upload request. 
> While this FileSyste