RE: HDFS to S3 copy issues
Hi Momina,

Could it be that you misspelled the port in your source path? Would you mind trying with: hdfs://10.240.113.162:9000/data/

Ivan

-----Original Message-----
From: Momina Khan [mailto:momina.a...@gmail.com]
Sent: Thursday, July 05, 2012 10:30 PM
To: common-dev@hadoop.apache.org
Subject: HDFS to S3 copy issues

hi ... hope someone is able to help me out with this ... I have tried an exhaustive search of Google and the AWS forum, but there is little help in this regard and what I found didn't work for me!

I want to copy data from HDFS to my S3 bucket. To test whether my HDFS URL is correct, I tried the fs -cat command, which works just fine and prints the contents of the file:

ubuntu@domU-12-31-39-04-6E-58:/state/partition1/hadoop-1.0.1$ bin/hadoop fs -cat hdfs://10.240.113.162:9000/data/hello.txt

But when I try to distcp the file from HDFS (same location as above) to my S3 bucket, the connection to the server is refused! I have looked this up exhaustively but cannot find an answer. Some say the port may be blocked, but I have checked that 9000-9001 are not blocked. Could it be an authentication issue? Just guessing ... out of ideas. Find the call trace attached below:

ubuntu@domU-12-31-39-04-6E-58:/state/partition1/hadoop-1.0.1$ bin/hadoop distcp hdfs://10.240.113.162:9001/data/ s3://ID:SECRET@momina

12/07/05 12:48:37 INFO tools.DistCp: srcPaths=[hdfs://10.240.113.162:9001/data]
12/07/05 12:48:37 INFO tools.DistCp: destPath=s3://ID:SECRET@momina
12/07/05 12:48:38 INFO ipc.Client: Retrying connect to server: domU-12-31-39-04-6E-58.compute-1.internal/10.240.113.162:9001. Already tried 0 time(s).
[identical retry lines for attempts 1 through 9, one second apart, omitted]
With failures, global counters are inaccurate; consider running with -i
Copy failed: java.net.ConnectException: Call to domU-12-31-39-04-6E-58.compute-1.internal/10.240.113.162:9001 failed on connection exception: java.net.ConnectException: Connection refused
    at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
    at org.apache.hadoop.ipc.Client.call(Client.java:1071)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
    at $Proxy1.getProtocolVersion(Unknown Source)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
    at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
    at org.apache.hadoop.hdfs.DFSClient.init(DFSClient.java:238)
    at org.apache.hadoop.hdfs.DFSClient.init(DFSClient.java:203)
    at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)
    at org.apache.hadoop.tools.DistCp.checkSrcPath(DistCp.java:635)
    at org.apache.hadoop.tools.DistCp.copy(DistCp.java:656)
    at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
    at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
Caused by: java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at ... [trace truncated in the original]
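A quick way to confirm which port the NameNode should be listening on is to read fs.default.name from core-site.xml (the Hadoop 1.x key); the explicit hdfs:// URI passed to distcp must match that host:port exactly. A minimal sketch against a sample file (the value shown is hypothetical, mirroring the 9000 that worked for fs -cat above; on a real cluster read conf/core-site.xml instead):

```shell
# Sample core-site.xml standing in for conf/core-site.xml on the cluster.
cat > /tmp/core-site.xml <<'EOF'
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://10.240.113.162:9000</value>
  </property>
</configuration>
EOF

# Extract the NameNode URI; the distcp hdfs:// source must use this host:port.
sed -n 's|.*<value>\(hdfs://[^<]*\)</value>.*|\1|p' /tmp/core-site.xml
```

If the configured port were 9000, the 9001 in the failing distcp command would be one consistent explanation for the "Connection refused" retries (nothing listening on 9001).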
[jira] [Created] (HADOOP-8569) CMakeLists.txt: define _GNU_SOURCE and _LARGEFILE_SOURCE
Colin Patrick McCabe created HADOOP-8569:
--------------------------------------------

             Summary: CMakeLists.txt: define _GNU_SOURCE and _LARGEFILE_SOURCE
                 Key: HADOOP-8569
                 URL: https://issues.apache.org/jira/browse/HADOOP-8569
             Project: Hadoop Common
          Issue Type: Bug
            Reporter: Colin Patrick McCabe
            Assignee: Colin Patrick McCabe
            Priority: Minor

In the native code, we should define _GNU_SOURCE and _LARGEFILE_SOURCE so that all of the functions on Linux are available. _LARGEFILE_SOURCE enables fseeko and ftello; _GNU_SOURCE enables a variety of Linux-specific functions from glibc, including sync_file_range.

--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
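In CMake terms the change described above is essentially a one-liner; a minimal sketch of what the ticket proposes (not the actual committed patch):

```cmake
# Feature-test macros for the native build:
# _LARGEFILE_SOURCE exposes fseeko/ftello,
# _GNU_SOURCE exposes glibc extras such as sync_file_range.
add_definitions(-D_GNU_SOURCE -D_LARGEFILE_SOURCE)
```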
Re: HDFS to S3 copy issues
hi Ivan,

I have tried with both ports 9000 and 9001; I get the same error dump.

best
momina

On Fri, Jul 6, 2012 at 11:01 AM, Ivan Mitic iva...@microsoft.com wrote:
[quoted reply, original message, and log output trimmed; see the first message in this thread]
Re: HDFS to S3 copy issues
You may want to try the following command: instead of using hdfs for the source, try hftp:

hadoop distcp -i -ppgu -log /tmp/mylog -m 20 hftp://servername:port/path (hdfs://target.server:port/path | s3://id:secret@domain)

On Fri, Jul 6, 2012 at 12:19 PM, Momina Khan momina.a...@gmail.com wrote:
[quoted reply, original message, and log output trimmed; see the first message in this thread]
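The suggested hftp source reads over the NameNode's embedded HTTP server rather than the RPC port, which sidesteps the refused 9001 connection entirely. A sketch that only assembles the command line (50070 is the Hadoop 1.x default HTTP port for dfs.http.address, an assumption worth verifying in hdfs-site.xml; ID/SECRET are the same placeholders used above):

```shell
NAMENODE=10.240.113.162
HTTP_PORT=50070   # assumed 1.x default; check dfs.http.address in hdfs-site.xml

# Build the distcp invocation; echoed rather than run, since it needs a live cluster.
CMD="bin/hadoop distcp -i -ppgu -log /tmp/mylog -m 20 hftp://${NAMENODE}:${HTTP_PORT}/data/ s3://ID:SECRET@momina"
echo "$CMD"
```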
[jira] [Created] (HADOOP-8570) Bzip2Codec should accept .bz files too
Harsh J created HADOOP-8570:
--------------------------------

             Summary: Bzip2Codec should accept .bz files too
                 Key: HADOOP-8570
                 URL: https://issues.apache.org/jira/browse/HADOOP-8570
             Project: Hadoop Common
          Issue Type: Improvement
          Components: io
    Affects Versions: 2.0.0-alpha, 1.0.0
            Reporter: Harsh J

The default extension reported for Bzip2Codec today is .bz2. This causes it not to pick up .bz files as Bzip2Codec files. Although the extension is not very popular today, it is still mentioned as a valid extension in the bunzip2 manual, and we should support it. We should either change the Bzip2Codec default extension to .bz, or add support for an extension list to allow for better detection across the various aliases.
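The improvement boils down to suffix matching: a codec whose only registered extension is ".bz2" never claims ".bz" files. A small shell sketch of the desired multi-suffix behavior (illustrative only, not Hadoop's actual codec lookup):

```shell
# Match files against a list of aliases instead of a single default extension.
matched=""
for f in data.bz2 data.bz data.gz; do
  case "$f" in
    *.bz2|*.bz) matched="$matched $f"; echo "$f: bzip2" ;;
    *)          echo "$f: no bzip2 match" ;;
  esac
done
```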
Re: HDFS to S3 copy issues
hi Momina,

Maybe the problem is your DNS resolution. You must have IP/hostname entries for all nodes in the /etc/hosts file, like this:

127.0.0.1 localhost

On Fri, Jul 6, 2012 at 2:49 PM, Momina Khan momina.a...@gmail.com wrote:
[quoted reply, original message, and log output trimmed; see the first message in this thread]
[jira] [Created] (HADOOP-8571) Improve resource cleaning when shutting down
Guillaume Nodet created HADOOP-8571:
----------------------------------------

             Summary: Improve resource cleaning when shutting down
                 Key: HADOOP-8571
                 URL: https://issues.apache.org/jira/browse/HADOOP-8571
             Project: Hadoop Common
          Issue Type: Improvement
            Reporter: Guillaume Nodet
Re: HDFS to S3 copy issues
hi,

HDFS is running on just one node, and I get the same connection refused error no matter whether I try with the node's private DNS or with localhost. I do have

127.0.0.1 localhost

in my /etc/hosts file.

thanks in advance!
momina

On Fri, Jul 6, 2012 at 12:22 PM, feng lu amuseme...@gmail.com wrote:
[quoted reply, original message, and log output trimmed; see the first message in this thread]
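If DNS is the culprit, the hostname the IPC client prints (domU-12-31-39-04-6E-58.compute-1.internal) has to resolve to the address the NameNode actually binds. A sketch of a hosts-file-style lookup against a sample file (the live check would read /etc/hosts itself; the mapping shown is an assumption built from the addresses in the log):

```shell
# Sample hosts file standing in for /etc/hosts on the node.
cat > /tmp/hosts.sample <<'EOF'
127.0.0.1 localhost
10.240.113.162 domU-12-31-39-04-6E-58.compute-1.internal
EOF

# Resolve the NameNode hostname the way a hosts-file lookup would.
awk '$2 == "domU-12-31-39-04-6E-58.compute-1.internal" {print $1}' /tmp/hosts.sample
```

If this prints an address other than the one the NameNode listens on, the IPC client dials the wrong host and gets exactly this kind of refusal.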
[jira] [Created] (HADOOP-8572) Have the ability to force the use of the login user
Guillaume Nodet created HADOOP-8572:
----------------------------------------

             Summary: Have the ability to force the use of the login user
                 Key: HADOOP-8572
                 URL: https://issues.apache.org/jira/browse/HADOOP-8572
             Project: Hadoop Common
          Issue Type: Improvement
            Reporter: Guillaume Nodet

In Karaf, most of the code is run under the karaf user. When a user sshes into Karaf, commands will be executed under that user. Deploying Hadoop inside Karaf requires that the authenticated Subject has the required Hadoop principals set, which forces the reconfiguration of the whole security layer, even at dev time.

My patch proposes the introduction of a new configuration property {{hadoop.security.force.login.user}} which, if set to true (it would default to false to keep the current behavior), would force the use of the login user instead of the authenticated subject (which is what happens when there is no authenticated subject at all). This greatly simplifies the use of Hadoop in environments where security isn't really needed (at dev time).
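If the patch landed as proposed, enabling the switch would look like this (the property name comes from the ticket; placing it in core-site.xml is an assumption, and the property does not exist unless the patch is applied):

```xml
<!-- Hypothetical: force use of the login user instead of the authenticated Subject. -->
<property>
  <name>hadoop.security.force.login.user</name>
  <value>true</value>
</property>
```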
Re: No mapred-site.xml in the hadoop-0.23.3 distribution
That may be something that we missed, as I have been providing my own mapred-site.xml for quite a while now. Have you tried it with branch-2 or trunk to see if they are providing it? In either case it is just going to be a template for you to fill in, but it would be nice to package that template for our users to follow. If you want to file a JIRA for that it would be good, but I don't know how quickly we will be able to get around to doing it.

--Bobby Evans

On 7/5/12 7:23 PM, Pavan Kulkarni pavan.babu...@gmail.com wrote:

Hi,

I downloaded the Hadoop-0.23.3 source, tweaked a few classes, and when I built the binary distribution and untarred it, I don't see the mapred-site.xml file in the /etc/hadoop directory. But by the details given on how to run Hadoop-0.23.3, mapred-site.xml needs to be configured, right? So I was just wondering whether we are supposed to create mapred-site.xml, or whether it doesn't exist at all? Thanks

--
--With Regards
Pavan Kulkarni
[jira] [Created] (HADOOP-8573) Configuration tries to read from an inputstream resource multiple times.
Robert Joseph Evans created HADOOP-8573:
---------------------------------------------

             Summary: Configuration tries to read from an inputstream resource multiple times.
                 Key: HADOOP-8573
                 URL: https://issues.apache.org/jira/browse/HADOOP-8573
             Project: Hadoop Common
          Issue Type: Bug
          Components: conf
    Affects Versions: 1.0.2, 0.23.3, 2.0.1-alpha, 3.0.0
            Reporter: Robert Joseph Evans
            Assignee: Robert Joseph Evans

If someone calls Configuration.addResource(InputStream) and then reloadConfiguration is called for any reason, Configuration will try to reread the contents of the InputStream after it has already closed it. This never showed up in 1.0 because the framework itself does not call addResource with an InputStream, and typically by the time user code that might call this starts running, all of the default and site resources have already been loaded. In 0.23 mapreduce is now a client library, and mapred-site.xml and mapred-default.xml are loaded much later in the process.
[jira] [Created] (HADOOP-8574) Enable starting hadoop services from inside OSGi
Guillaume Nodet created HADOOP-8574:
----------------------------------------

             Summary: Enable starting hadoop services from inside OSGi
                 Key: HADOOP-8574
                 URL: https://issues.apache.org/jira/browse/HADOOP-8574
             Project: Hadoop Common
          Issue Type: New Feature
            Reporter: Guillaume Nodet

This JIRA captures what is needed in order to start Hadoop services in OSGi. The main idea I have used so far consists of:
* using the OSGi ConfigAdmin to store the Hadoop configuration
* in that configuration, using a few boolean properties to determine which services should be started (nameNode, dataNode ...)
* exposing a configured URL handler so that the whole OSGi runtime can use URLs in hdfs:/xxx
* using an OSGi ManagedService, which means that when the configuration changes, the services are stopped and restarted with the new configuration
[jira] [Created] (HADOOP-8575) No mapred-site.xml present in the configuration directory. This is very trivial but thought would be less confusing for a new user if it came packaged.
Pavan Kulkarni created HADOOP-8575:
---------------------------------------

             Summary: No mapred-site.xml present in the configuration directory. This is very trivial but thought would be less confusing for a new user if it came packaged.
                 Key: HADOOP-8575
                 URL: https://issues.apache.org/jira/browse/HADOOP-8575
             Project: Hadoop Common
          Issue Type: Bug
          Components: conf
    Affects Versions: 0.23.1, 0.23.0
         Environment: Linux
            Reporter: Pavan Kulkarni
            Priority: Minor
             Fix For: 0.23.2, 0.23.3

The binary distribution of hadoop-0.23.3 has no mapred-site.xml file in the /etc/hadoop directory, yet setting up a cluster requires configuring mapred-site.xml. Though this is a trivial issue, new users might get confused while configuring.
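Until the distribution ships a template, a workable stopgap is to drop an empty skeleton into the configuration directory and fill in properties as needed. A minimal sketch (written to /tmp here for illustration; the real file belongs in the distribution's etc/hadoop directory, and the commented-out property is a common 0.23/YARN setting offered as an assumption, not a required value):

```shell
# Create a minimal mapred-site.xml skeleton; real deployments add properties inside.
cat > /tmp/mapred-site.xml <<'EOF'
<?xml version="1.0"?>
<configuration>
  <!-- e.g. <property><name>mapreduce.framework.name</name><value>yarn</value></property> -->
</configuration>
EOF

# Sanity check: the skeleton has matching open/close configuration tags.
grep -c "configuration" /tmp/mapred-site.xml
```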
Re: No mapred-site.xml in the hadoop-0.23.3 distribution
Bobby, Thanks a lot for your clarification. Yes, as you said, it is just a template, but it may be quite confusing to new users while configuring. I have raised the issue https://issues.apache.org/jira/browse/HADOOP-8575, in case you might want to have a look. Thanks. Also, I wanted to know: is there a good source I can look up for running a multi-node 2nd-generation Hadoop? All I find is 1st-generation Hadoop setup guides.

On Fri, Jul 6, 2012 at 7:13 AM, Robert Evans ev...@yahoo-inc.com wrote: That may be something that we missed, as I have been providing my own mapred-site.xml for quite a while now. Have you tried it with branch-2 or trunk to see if they are providing it? In either case it is just going to be a template for you to fill in, but it would be nice to package that template for our users to follow. If you want to file a JIRA for that, that would be good, but I don't know how quickly we will be able to get around to doing it. --Bobby Evans

On 7/5/12 7:23 PM, Pavan Kulkarni pavan.babu...@gmail.com wrote: Hi, I downloaded the Hadoop-0.23.3 source, tweaked a few classes, built the binary distribution, and untar'd it. I don't see the mapred-site.xml file in the /etc/hadoop directory. But per the details given on how to run Hadoop-0.23.3, mapred-site.xml needs to be configured, right? So I was just wondering whether we are supposed to create mapred-site.xml, or whether it doesn't exist at all? Thanks

-- --With Regards Pavan Kulkarni
Re: No mapred-site.xml in the hadoop-0.23.3 distribution
Sorry, I don't know of a good source for that right now. Perhaps others on the list might know better than I do.

On 7/6/12 12:05 PM, Pavan Kulkarni pavan.babu...@gmail.com wrote: Bobby, Thanks a lot for your clarification. Yes, as you said, it is just a template, but it may be quite confusing to new users while configuring. I have raised the issue https://issues.apache.org/jira/browse/HADOOP-8575, in case you might want to have a look. Thanks. Also, I wanted to know: is there a good source I can look up for running a multi-node 2nd-generation Hadoop? All I find is 1st-generation Hadoop setup guides.
[CVE-2012-3376] Apache Hadoop HDFS information disclosure vulnerability
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1

Hello, Users of Apache Hadoop should be aware of a recently discovered security vulnerability, described by the CVE below. In particular, please note the "Users affected", "Versions affected", and "Mitigation" sections. The project team will shortly announce a release vote for Apache Hadoop 2.0.1-alpha, which will comprise the contents of Apache Hadoop 2.0.0-alpha, this security patch, and a few patches for YARN. Best, Aaron T. Myers Software Engineer, Cloudera

CVE-2012-3376: Apache Hadoop HDFS information disclosure vulnerability

Severity: Critical
Vendor: The Apache Software Foundation
Versions affected: Hadoop 2.0.0-alpha
Users affected: Users who have enabled Hadoop's Kerberos/HDFS security features.

Impact: Malicious clients may gain write access to data for which they have read-only permission, or gain read access to any data blocks whose IDs they can determine.

Description: When Hadoop's security features are enabled, clients authenticate to DataNodes using BlockTokens issued by the NameNode to the client. The DataNodes are able to verify the validity of a BlockToken and will reject BlockTokens that were not issued by the NameNode. A DataNode determines whether or not it should check for BlockTokens when it registers with the NameNode. Due to a bug in the DataNode/NameNode registration process, a DataNode that registers more than once for the same block pool will conclude that it thereafter no longer needs to check for BlockTokens sent by clients. That is, the client will continue to send BlockTokens as part of its communication with DataNodes, but the DataNodes will not check their validity. A DataNode will register more than once for the same block pool whenever the NameNode restarts, or when HA is enabled.

Mitigation: Users of 2.0.0-alpha should immediately apply the patch provided below to their systems, and should upgrade to 2.0.1-alpha as soon as it becomes available.
Credit: This issue was discovered by Aaron T. Myers of Cloudera. A signed patch against Apache Hadoop 2.0.0-alpha for this issue can be found here: https://people.apache.org/~atm/cve-2012-3376/ -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.11 (GNU/Linux) iQEcBAEBAgAGBQJP9xp7AAoJECEaGfB4kTjfGWMH/2fXnrngfpQL+d1QLG3wDOPn OAJK3Tj/JrII1ETCguI6DOjpQaRrnzSvyCdWOHApbGG6LxwSvTlwEBPUR8SMZFxY TA13eJPz21ZXtXZ9oGvg1BMw+wRwfmem0Sl508c8kJpSfHXD4W89wyG/5Z+1pz5d s0aHUMVj5YT32yH45Tp192nB5d4XQ7gdUmCLB4HF8fxrrIH2jWU0NX63DT6dXE5w DUqKq6nTFDHnuTA1IO0B8OAVGv2M/kq8P3Fi+pnVvVao+ttkWIK1z7Ts11gfL7d0 /rE4VgZ7Cwc2o1Fx8s1LCKKLIDrO15aULOSbEa9nl6yQywEEjn2h6cKXHv6RUHM= =wrvr -END PGP SIGNATURE-
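The registration bug described in the advisory can be modelled abstractly as follows. This is a toy model written to make the failure mode concrete, not Hadoop code; the class and method names are invented for the sketch:

```python
class DataNode:
    """Toy model of the DataNode registration bug described in CVE-2012-3376."""

    def __init__(self):
        # Whether to verify BlockTokens is decided at registration time.
        self.check_block_tokens = True

    def register(self, namenode_requires_tokens, already_registered):
        # Buggy logic: a repeat registration for the same block pool
        # wrongly concludes that token checking is no longer needed.
        if already_registered:
            self.check_block_tokens = False  # <-- the bug
        else:
            self.check_block_tokens = namenode_requires_tokens

    def serve(self, token_valid):
        # Clients keep sending tokens, but after the bug triggers
        # the tokens are simply not verified.
        if self.check_block_tokens and not token_valid:
            return "rejected"
        return "served"


dn = DataNode()
dn.register(namenode_requires_tokens=True, already_registered=False)
assert dn.serve(token_valid=False) == "rejected"  # tokens enforced at first

# After a NameNode restart (or with HA), the DataNode registers again:
dn.register(namenode_requires_tokens=True, already_registered=True)
assert dn.serve(token_valid=False) == "served"  # invalid token now accepted
```

The second registration flips the check off, which is why the advisory singles out NameNode restarts and HA deployments as the triggering conditions.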
[jira] [Resolved] (HADOOP-8523) test-patch.sh doesn't validate patches before building
[ https://issues.apache.org/jira/browse/HADOOP-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Eagles resolved HADOOP-8523.
-
Resolution: Fixed
Fix Version/s: 3.0.0, 2.0.1-alpha

test-patch.sh doesn't validate patches before building
--
Key: HADOOP-8523
URL: https://issues.apache.org/jira/browse/HADOOP-8523
Project: Hadoop Common
Issue Type: Improvement
Components: build
Affects Versions: 0.23.3, 2.0.1-alpha
Reporter: Jack Dintruff
Priority: Minor
Labels: newbie
Fix For: 2.0.1-alpha, 3.0.0
Attachments: HADOOP-8523.patch, HADOOP-8523.patch, Hadoop-8523.patch, Hadoop-8523.patch

When running test-patch.sh with an invalid patch (not formatted properly) or one that doesn't compile, the script spends a lot of time building Hadoop before checking whether the patch is invalid. It would help devs if it checked first, just in case test-patch.sh is run with a bad patch file.
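The improvement described in the issue amounts to a cheap dry-run gate before the expensive build. A sketch of such a pre-check is below; `check_patch` is a hypothetical helper name, and the committed fix may well do this differently:

```shell
# Hypothetical pre-flight check: dry-run the patch before spending
# time on a full Hadoop build. Returns nonzero if the patch would
# not apply cleanly (or is not a patch at all).
check_patch() {
  patch -p0 --dry-run -f < "$1" > /dev/null 2>&1
}

# Example gate at the top of a build script:
# check_patch "$PATCH_FILE" || { echo "invalid patch, aborting" >&2; exit 1; }
```

Because `--dry-run` never touches the working tree, the gate is safe to run unconditionally before any compilation step.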
Re: [ANNOUNCE] Hadoop branch-1.1 and Release Plan for Hadoop-1.1.0
Hi Cos, the query string didn't come through on the link you sent, but the JIRA query I use is:

project in (HADOOP,HDFS,MAPREDUCE) and ((Target Version/s = '1.1.0' and (fixVersion != '1.1.0' or fixVersion is EMPTY)) or (fixVersion = '1.1.0' and Target Version/s is EMPTY)) and (status != Closed and status != Resolved) ORDER BY KEY

You're correct that there are quite a few, currently 107, open jiras originally targeted for 1.1.0 that do not have committed fixes. Many of these are just the inherited backlog of previously identified work. I need to move them to Target Version/s = 1.1.1. Folks have requested that the following currently open jiras be included in 1.1.0:

HADOOP-8417 - HADOOP-6963 didn't update hadoop-core-pom-template.xml
HADOOP-8445 - Token should not print the password in toString
HDFS-96 - HDFS does not support blocks greater than 2GB
HDFS-3596 - Improve FSEditLog pre-allocation in branch-1
MAPREDUCE-2903 - Map Tasks graph is throwing XML Parse error when Job is executed with 0 maps
MAPREDUCE-2129 - Job may hang if mapreduce.job.committer.setup.cleanup.needed=false and mapreduce.map/reduce.failures.maxpercent > 0
---
MAPREDUCE-4049 - plugin for generic shuffle service
HADOOP-7823 - port HADOOP-4012 to branch-1 (splitting support for bzip2)

The first six are simple patches that I am comfortable including. The last two are complex patches that have not yet been committed; I am planning to defer those two to 1.1.1. Beyond that, I'm going to cut 1.1.0-rc0 from the current state of branch-1.1. I'm planning to do that this weekend. This is obviously delayed from the previous plan, for which I apologize. Comments welcome. --Matt

On Tue, Jul 3, 2012 at 8:32 PM, Konstantin Boudnik c...@apache.org wrote: Hi Matt. I am picking up the hat of BigTop's maintainer for the Hadoop 1.x line, and I wanted to sync up with you about the Hadoop 1.1 release outlook, progress, what help you might need, etc.
I see a few jiras left open in the release http://is.gd/OyuaNQ Is this the correct representation of the current status? How can I help from the BigTop side (I haven't yet finalized the stack's versions)? Looking forward to your input. Thanks. Cos

On Fri, May 25, 2012 at 02:49PM, Matt Foley wrote: Greetings. With the approval of a public vote on common-dev@, I have branched Hadoop branch-1 to create branch-1.1. From this, I will create a release candidate RC-0 for Hadoop-1.1.0, hopefully to be available shortly after this weekend. There are over 80 patches in branch-1, over and above the contents of hadoop-1.0.3, so I anticipate that some stabilization will be needed before the RC can be approved as a 1.1.0 release. Your participation in assuring a stable RC is very important. When it becomes available, please download it and work with it to determine whether it is stable enough to release, and report any issues found. My colleagues and I will do likewise, of course, but no one company can adequately exercise a new release with this many new contributions. There are two outstanding issues that are not yet committed, but that I know the contributors hope to see in 1.1.0:

MAPREDUCE-4049 https://issues.apache.org/jira/browse/MAPREDUCE-4049
HADOOP-4012 https://issues.apache.org/jira/browse/HADOOP-4012

Assuming there is an RC-1, and that these two patches can be committed during stabilization of RC-0, I will plan to incorporate these additional items in RC-1. Best regards, --Matt Release Manager
Re: No mapred-site.xml in the hadoop-0.23.3 distribution
Hi Robert, Could you please share which configuration files are mandatory for hadoop-0.23.3 to work? I am tuning a few but am still not able to set it up completely. Thanks

On Fri, Jul 6, 2012 at 10:24 AM, Robert Evans ev...@yahoo-inc.com wrote: Sorry, I don't know of a good source for that right now. Perhaps others on the list might know better than I do.

-- --With Regards Pavan Kulkarni
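As a starting point for the question above, a commonly used minimal set for a single-node 0.23 (YARN) setup is sketched below. Hostnames, ports, and values are placeholders to adapt, not authoritative requirements:

```xml
<!-- core-site.xml: where HDFS lives -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>

<!-- hdfs-site.xml: single-node, so one replica -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

<!-- mapred-site.xml: run MapReduce on YARN -->
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
```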
[jira] [Resolved] (HADOOP-8554) KerberosAuthenticator should use the configured principal
[ https://issues.apache.org/jira/browse/HADOOP-8554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eli Collins resolved HADOOP-8554.
-
Resolution: Invalid

You're right, thanks for the explanation; I didn't realize the principal config was server-side only. Also, the reason I hit this with webhdfs and not hftp is that hftp doesn't support SPNEGO.

KerberosAuthenticator should use the configured principal
-
Key: HADOOP-8554
URL: https://issues.apache.org/jira/browse/HADOOP-8554
Project: Hadoop Common
Issue Type: Bug
Components: security
Affects Versions: 1.0.0, 2.0.0-alpha, 2.0.1-alpha, 3.0.0
Reporter: Eli Collins
Labels: security, webconsole

In KerberosAuthenticator we construct the principal as follows:
{code}
String servicePrincipal = "HTTP/" + KerberosAuthenticator.this.url.getHost();
{code}
Seems like we should use the configured hadoop.http.authentication.kerberos.principal instead, right? I hit this issue because a distcp using webhdfs://localhost fails, since HTTP/localhost is not in the kerb DB, while webhdfs://eli-thinkpad works because HTTP/eli-thinkpad is (and is my configured principal). A distcp using hftp://localhost with the same config works, so it looks like this check is webhdfs-specific for some reason (webhdfs is using SPNEGO and hftp is not?).