RE: HDFS to S3 copy issues

2012-07-06 Thread Ivan Mitic
Hi Momina,

Could it be that you misspelled the port in your source path? Would you mind trying 
with: hdfs://10.240.113.162:9000/data/ 

Ivan
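
For reference, a minimal sketch of an invocation that should work on Hadoop 1.0.x
once the port matches the NameNode's fs.default.name (9000, judging by the working
fs -cat in the quoted message below). It uses the s3n:// native S3 filesystem and
passes the AWS keys as properties instead of embedding them in the URI; the bucket
name and the key placeholders are illustrative only:

  bin/hadoop distcp \
    -Dfs.s3n.awsAccessKeyId=YOUR_ACCESS_KEY \
    -Dfs.s3n.awsSecretAccessKey=YOUR_SECRET_KEY \
    hdfs://10.240.113.162:9000/data \
    s3n://momina/data

DistCp runs through ToolRunner (visible in the stack trace below), so the -D
generic options are honored; keeping the secret key out of the URI also avoids the
escaping problems that keys containing '/' cause in s3:// or s3n:// URIs.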

-----Original Message-----
From: Momina Khan [mailto:momina.a...@gmail.com] 
Sent: Thursday, July 05, 2012 10:30 PM
To: common-dev@hadoop.apache.org
Subject: HDFS to S3 copy issues

hi ... hope someone is able to help me out with this ... i have tried an 
exhaustive search of Google and the AWS forum but there is little help in this 
regard and all that i found didn't work for me!

i want to copy data from HDFS to my S3 bucket ... to test whether my HDFS url 
is correct i tried the fs -cat command which works just fine ... it spits out the 
contents of the file ubuntu@domU-12-31-39-04-6E-58:/state/partition1/hadoop-1.0.1$ 
*bin/hadoop fs -cat hdfs://10.240.113.162:9000/data/hello.txt*

but when i try to distcp the file from hdfs (same location as above) to 
my s3 bucket it says connection to server refused! i have looked up Google 
exhaustively but cannot get an answer. they say that the port may be blocked, 
but i have checked that 9000-9001 are not blocked ... could it be an 
authentication issue? just saying ... out of ideas.

Find the call trace attached below:

ubuntu@domU-12-31-39-04-6E-58:/state/partition1/hadoop-1.0.1$ *bin/hadoop 
distcp hdfs://10.240.113.162:9001/data/ s3://ID:**SECRET@momina*

12/07/05 12:48:37 INFO tools.DistCp: srcPaths=[hdfs://10.240.113.162:9001/data]
12/07/05 12:48:37 INFO tools.DistCp: destPath=s3://ID:SECRET@momina

12/07/05 12:48:38 INFO ipc.Client: Retrying connect to server:
domU-12-31-39-04-6E-58.compute-1.internal/10.240.113.162:9001. Already tried 0 
time(s).
12/07/05 12:48:39 INFO ipc.Client: Retrying connect to server:
domU-12-31-39-04-6E-58.compute-1.internal/10.240.113.162:9001. Already tried 1 
time(s).
12/07/05 12:48:40 INFO ipc.Client: Retrying connect to server:
domU-12-31-39-04-6E-58.compute-1.internal/10.240.113.162:9001. Already tried 2 
time(s).
12/07/05 12:48:41 INFO ipc.Client: Retrying connect to server:
domU-12-31-39-04-6E-58.compute-1.internal/10.240.113.162:9001. Already tried 3 
time(s).
12/07/05 12:48:42 INFO ipc.Client: Retrying connect to server:
domU-12-31-39-04-6E-58.compute-1.internal/10.240.113.162:9001. Already tried 4 
time(s).
12/07/05 12:48:43 INFO ipc.Client: Retrying connect to server:
domU-12-31-39-04-6E-58.compute-1.internal/10.240.113.162:9001. Already tried 5 
time(s).
12/07/05 12:48:44 INFO ipc.Client: Retrying connect to server:
domU-12-31-39-04-6E-58.compute-1.internal/10.240.113.162:9001. Already tried 6 
time(s).
12/07/05 12:48:45 INFO ipc.Client: Retrying connect to server:
domU-12-31-39-04-6E-58.compute-1.internal/10.240.113.162:9001. Already tried 7 
time(s).
12/07/05 12:48:46 INFO ipc.Client: Retrying connect to server:
domU-12-31-39-04-6E-58.compute-1.internal/10.240.113.162:9001. Already tried 8 
time(s).
12/07/05 12:48:47 INFO ipc.Client: Retrying connect to server:
domU-12-31-39-04-6E-58.compute-1.internal/10.240.113.162:9001. Already tried 9 
time(s).
With failures, global counters are inaccurate; consider running with -i
Copy failed: java.net.ConnectException: Call to
domU-12-31-39-04-6E-58.compute-1.internal/10.240.113.162:9001 failed on 
connection exception: java.net.ConnectException: Connection refused
at org.apache.hadoop.ipc.Client.wrapException(Client.java:1095)
at org.apache.hadoop.ipc.Client.call(Client.java:1071)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
at $Proxy1.getProtocolVersion(Unknown Source)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
at
org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
at org.apache.hadoop.hdfs.DFSClient.init(DFSClient.java:238)
at org.apache.hadoop.hdfs.DFSClient.init(DFSClient.java:203)
at
org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
at
org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)
at org.apache.hadoop.tools.DistCp.checkSrcPath(DistCp.java:635)
at org.apache.hadoop.tools.DistCp.copy(DistCp.java:656)
at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
at
org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at 

[jira] [Created] (HADOOP-8569) CMakeLists.txt: define _GNU_SOURCE and _LARGEFILE_SOURCE

2012-07-06 Thread Colin Patrick McCabe (JIRA)
Colin Patrick McCabe created HADOOP-8569:


 Summary: CMakeLists.txt: define _GNU_SOURCE and _LARGEFILE_SOURCE
 Key: HADOOP-8569
 URL: https://issues.apache.org/jira/browse/HADOOP-8569
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor


In the native code, we should define _GNU_SOURCE and _LARGEFILE_SOURCE so that 
all of the functions on Linux are available.

_LARGEFILE_SOURCE enables fseeko and ftello; _GNU_SOURCE enables a variety of 
Linux-specific functions from glibc, including sync_file_range.
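
A minimal sketch of the kind of change being described, assuming the definitions
are added globally in the native build's CMakeLists.txt (the exact placement in
the eventual patch may differ):

  add_definitions(-D_GNU_SOURCE -D_LARGEFILE_SOURCE)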





Re: HDFS to S3 copy issues

2012-07-06 Thread Momina Khan
hi Ivan,

i have tried both ports, 9000 and 9001, and i get the same error dump ...

best
momina
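
One way to confirm which port the NameNode RPC server is actually listening on (a
general troubleshooting sketch, not specific to this cluster) is to compare
fs.default.name in conf/core-site.xml with the ports the NameNode process has open:

  grep -A 1 fs.default.name conf/core-site.xml
  sudo netstat -tlnp | grep java

Whatever host:port fs.default.name reports is what the hdfs:// source URI passed to
distcp has to match; a working fs -cat on 9000 together with "Connection refused" on
9001 usually just means nothing is listening on 9001.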

On Fri, Jul 6, 2012 at 11:01 AM, Ivan Mitic iva...@microsoft.com wrote:

 Hi Momina,

 Could it be that you misspelled the port in your source path, you mind
 trying with: hdfs://10.240.113.162:9000/data/

 Ivan


Re: HDFS to S3 copy issues

2012-07-06 Thread Nitin Pawar
you may want to try the following command

instead of using hdfs, try hftp

hadoop distcp -i -ppgu -log /tmp/mylog -m 20 hftp://servername:port/path
 (hdfs://target.server:port/path | s3://id:secret@domain)
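
A concrete form of that suggestion for the cluster above, assuming the NameNode's
HTTP port is at its default of 50070 (hftp is read-only and goes through the
NameNode/DataNode web interfaces rather than the 9000/9001 RPC port; the s3n://
destination and bucket name are illustrative):

  bin/hadoop distcp hftp://10.240.113.162:50070/data s3n://momina/data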

On Fri, Jul 6, 2012 at 12:19 PM, Momina Khan momina.a...@gmail.com wrote:

 hi Ivan,

 i have tried with both ports 9000 and 9001 i get the same error dump ...

 best
 momina


[jira] [Created] (HADOOP-8570) Bzip2Codec should accept .bz files too

2012-07-06 Thread Harsh J (JIRA)
Harsh J created HADOOP-8570:
---

 Summary: Bzip2Codec should accept .bz files too
 Key: HADOOP-8570
 URL: https://issues.apache.org/jira/browse/HADOOP-8570
 Project: Hadoop Common
  Issue Type: Improvement
  Components: io
Affects Versions: 2.0.0-alpha, 1.0.0
Reporter: Harsh J


The default extension reported for Bzip2Codec today is .bz2. This causes it 
not to pick up .bz files as Bzip2Codec files. Although the extension is not 
very popular today, it is still mentioned as a valid extension in the bunzip 
manual and we should support it.

We should either change the Bzip2Codec default extension to bz, or add new 
extension-list support to allow better detection across the various 
aliases.
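
Until the codec accepts the alias, one possible interim workaround (my assumption,
valid only if the .bz files really contain ordinary bzip2 data, which is what the
bunzip manual's treatment of the extension implies) is simply to rename them so
that CompressionCodecFactory's extension match succeeds:

  bin/hadoop fs -mv /data/input.bz /data/input.bz2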





Re: HDFS to S3 copy issues

2012-07-06 Thread feng lu
hi Momina

maybe the problem is your DNS resolution. You must have IP-to-hostname
entries for all nodes in the /etc/hosts file, like this:

127.0.0.1 localhost


On Fri, Jul 6, 2012 at 2:49 PM, Momina Khan momina.a...@gmail.com wrote:

hi Ivan,

 i have tried with both ports 9000 and 9001 i get the same error dump ...

 best
 momina


[jira] [Created] (HADOOP-8571) Improve resource cleaning when shutting down

2012-07-06 Thread Guillaume Nodet (JIRA)
Guillaume Nodet created HADOOP-8571:
---

 Summary: Improve resource cleaning when shutting down
 Key: HADOOP-8571
 URL: https://issues.apache.org/jira/browse/HADOOP-8571
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Guillaume Nodet








Re: HDFS to S3 copy issues

2012-07-06 Thread Momina Khan
hi

hdfs is running on just one node, and i get the same connection refused
error no matter whether i try with the node's private DNS name or with
localhost. i do have
127.0.0.1 localhost in my /etc/hosts file

thanks in advance!
momina

On Fri, Jul 6, 2012 at 12:22 PM, feng lu amuseme...@gmail.com wrote:

 hi Momina

 maybe the problem is your DNS Resolution. You must have IP hostname
 enteries if all nodes in /etc/hosts file. like this

 127.0.0.1 localhost



[jira] [Created] (HADOOP-8572) Have the ability to force the use of the login user

2012-07-06 Thread Guillaume Nodet (JIRA)
Guillaume Nodet created HADOOP-8572:
---

 Summary: Have the ability to force the use of the login user 
 Key: HADOOP-8572
 URL: https://issues.apache.org/jira/browse/HADOOP-8572
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Guillaume Nodet


In Karaf, most of the code runs as the karaf user. When a user sshes into 
Karaf, commands will be executed as that user.
Deploying hadoop inside Karaf requires that the authenticated Subject has the 
required hadoop principals set, which forces the reconfiguration of the whole 
security layer, even at dev time.

My patch proposes the introduction of a new configuration property 
{{hadoop.security.force.login.user}} which, if set to true (it would default to 
false to keep the current behavior), would force the use of the login user 
instead of the authenticated subject (which is what happens when there is 
no authenticated subject at all).  This greatly simplifies the use of hadoop in 
environments where security isn't really needed (at dev time).
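
As described, opting in would then be a single entry in the client's configuration
(a sketch of the proposed property only; it does not exist in any released version):

  <property>
    <name>hadoop.security.force.login.user</name>
    <value>true</value>
  </property>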





Re: No mapred-site.xml in the hadoop-0.23.3 distribution

2012-07-06 Thread Robert Evans
That may be something that we missed, as I have been providing my own
mapred-site.xml for quite a while now.  Have you tried it with branch-2 or
trunk to see if they are providing it?  In either case it is just going to
be a template for you to fill in, but it would be nice to package that
template for our users to follow.  If you want to file a JIRA for that it
would be good, but I don't know how quickly we will be able to get around
to doing it.

--Bobby Evans
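
If it helps, a minimal sketch of a hand-written mapred-site.xml for an 0.23/YARN
setup (only mapreduce.framework.name is shown; the target path assumes the
extracted tarball's etc/hadoop directory, and real clusters will want more
properties):

  cat > etc/hadoop/mapred-site.xml <<'EOF'
  <?xml version="1.0"?>
  <configuration>
    <property>
      <name>mapreduce.framework.name</name>
      <value>yarn</value>
    </property>
  </configuration>
  EOF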

On 7/5/12 7:23 PM, Pavan Kulkarni pavan.babu...@gmail.com wrote:

Hi,

  I downloaded the Hadoop-0.23.3 source and tweaked a few classes, and
when I built the binary distribution and untarred it, I don't see the
mapred-site.xml file in the /etc/hadoop directory. But from the details given
on how to run Hadoop-0.23.3, mapred-site.xml needs to be configured, right?

  So I was just wondering whether we are supposed to create mapred-site.xml
ourselves, or whether it doesn't exist at all? Thanks

-- 

--With Regards
Pavan Kulkarni



[jira] [Created] (HADOOP-8573) Configuration tries to read from an inputstream resource multiple times.

2012-07-06 Thread Robert Joseph Evans (JIRA)
Robert Joseph Evans created HADOOP-8573:
---

 Summary: Configuration tries to read from an inputstream resource 
multiple times. 
 Key: HADOOP-8573
 URL: https://issues.apache.org/jira/browse/HADOOP-8573
 Project: Hadoop Common
  Issue Type: Bug
  Components: conf
Affects Versions: 1.0.2, 0.23.3, 2.0.1-alpha, 3.0.0
Reporter: Robert Joseph Evans
Assignee: Robert Joseph Evans


If someone calls Configuration.addResource(InputStream) and then 
reloadConfiguration is called for any reason, Configuration will try to reread 
the contents of the InputStream after it has already closed it.

This never showed up in 1.0 because the framework itself does not call 
addResource with an InputStream, and typically by the time user code starts 
running that might call this, all of the default and site resources have 
already been loaded.

In 0.23 mapreduce is now a client library, and mapred-site.xml and 
mapred-default.xml are loaded much later in the process.





[jira] [Created] (HADOOP-8574) Enable starting hadoop services from inside OSGi

2012-07-06 Thread Guillaume Nodet (JIRA)
Guillaume Nodet created HADOOP-8574:
---

 Summary: Enable starting hadoop services from inside OSGi
 Key: HADOOP-8574
 URL: https://issues.apache.org/jira/browse/HADOOP-8574
 Project: Hadoop Common
  Issue Type: New Feature
Reporter: Guillaume Nodet


This JIRA captures what is needed in order to start hadoop services inside OSGi.

The main idea I have used so far consists of:
  * using the OSGi ConfigAdmin to store the hadoop configuration
  * in that configuration, using a few boolean properties to determine which 
services should be started (nameNode, dataNode ...)
  * exposing a configured url handler so that the whole OSGi runtime can use 
urls of the form hdfs:/xxx
  * using an OSGi ManagedService so that when the configuration changes, the 
services are stopped and restarted with the new configuration







[jira] [Created] (HADOOP-8575) No mapred-site.xml present in the configuration directory. This is very trivial but thought would be less confusing for a new user if it came packaged.

2012-07-06 Thread Pavan Kulkarni (JIRA)
Pavan Kulkarni created HADOOP-8575:
--

 Summary: No mapred-site.xml present in the configuration 
directory. This is very trivial but thought would be less confusing for a new 
user if it came packaged.
 Key: HADOOP-8575
 URL: https://issues.apache.org/jira/browse/HADOOP-8575
 Project: Hadoop Common
  Issue Type: Bug
  Components: conf
Affects Versions: 0.23.1, 0.23.0
 Environment: Linux
Reporter: Pavan Kulkarni
Priority: Minor
 Fix For: 0.23.2, 0.23.3


The binary distribution of hadoop-0.23.3 has no mapred-site.xml file in the 
/etc/hadoop directory. 
And for setting up the cluster we need to configure mapred-site.xml.
Though this is a trivial issue, new users might get confused while configuring.





Re: No mapred-site.xml in the hadoop-0.23.3 distribution

2012-07-06 Thread Pavan Kulkarni
Bobby,

  Thanks a lot for your clarification.
Yes, as you said, it is just a template, but it may
be quite confusing to new users while configuring.
I have raised the issue https://issues.apache.org/jira/browse/HADOOP-8575,
in case you might want to have a look. Thanks

 Also, I wanted to know: is there a good source I can
look up to for running a multi-node 2nd-generation Hadoop?
All I can find are 1st-generation Hadoop setup guides.

On Fri, Jul 6, 2012 at 7:13 AM, Robert Evans ev...@yahoo-inc.com wrote:

 That may be something that we missed, as I have been providing my own
 marped-site.xml for quite a while now.  Have you tried it with branch-2 or
 trunk to see if they are providing it?  In either case it is just going to
 be a template for you to fill in, but it would be nice to package that
 template for our users to follow.  If you want to file a JIRA for that it
 would be good, but I don't know how quickly we will be able to get around
 to doing it.

 --Bobby Evans





-- 

--With Regards
Pavan Kulkarni


Re: No mapred-site.xml in the hadoop-0.23.3 distribution

2012-07-06 Thread Robert Evans
Sorry, I don't know of a good source for that right now.  Perhaps others on
the list might know better than I do.

On 7/6/12 12:05 PM, Pavan Kulkarni pavan.babu...@gmail.com wrote:

Bobby,

  Thanks a lot for your clarification.
Yes as you said it is just a template, but it may
be quite confusing to new users while configuring.
I have raised the Issue
https://issues.apache.org/jira/browse/HADOOP-8575,
in case you might want to
have a look.Thanks

 Also I wanted to know is there a good source where I can
look upto for running a multi-node 2nd generation Hadoop ?
All I find is 1st generation Hadoop setup.




[CVE-2012-3376] Apache Hadoop HDFS information disclosure vulnerability

2012-07-06 Thread Aaron T. Myers
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello,

Users of Apache Hadoop should be aware of a security vulnerability recently
discovered, as described by the following CVE. In particular, please note the
Users affected, Versions affected, and Mitigation sections.

The project team will be announcing a release vote shortly for Apache Hadoop
2.0.1-alpha, which will comprise the contents of Apache Hadoop
2.0.0-alpha, this security patch, and a few patches for YARN.

Best,
Aaron T. Myers
Software Engineer, Cloudera

CVE-2012-3376: Apache Hadoop HDFS information disclosure vulnerability

Severity: Critical

Vendor: The Apache Software Foundation

Versions Affected: Hadoop 2.0.0-alpha

Users affected:
Users who have enabled Hadoop's Kerberos/HDFS security features.

Impact:
Malicious clients may gain write access to data for which they have read-only
permission, or gain read access to any data blocks whose IDs they can
determine.

Description:
When Hadoop's security features are enabled, clients authenticate to DataNodes
using BlockTokens issued by the NameNode to the client. The DataNodes are able
to verify the validity of a BlockToken, and will reject BlockTokens that were
not issued by the NameNode. The DataNode determines whether or not it should
check for BlockTokens when it registers with the NameNode.

Due to a bug in the DataNode/NameNode registration process, a DataNode which
registers more than once for the same block pool will conclude that it
thereafter no longer needs to check for BlockTokens sent by clients. That is,
the client will continue to send BlockTokens as part of its communication with
DataNodes, but the DataNodes will not check the validity of the tokens. A
DataNode will register more than once for the same block pool whenever the
NameNode restarts, or when HA is enabled.

Mitigation:
Users of 2.0.0-alpha should immediately apply the patch provided below to their
systems. Users should upgrade to 2.0.1-alpha as soon as it becomes available.

Credit: This issue was discovered by Aaron T. Myers of Cloudera.

A signed patch against Apache Hadoop 2.0.0-alpha for this issue can be found
here: https://people.apache.org/~atm/cve-2012-3376/

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)

iQEcBAEBAgAGBQJP9xp7AAoJECEaGfB4kTjfGWMH/2fXnrngfpQL+d1QLG3wDOPn
OAJK3Tj/JrII1ETCguI6DOjpQaRrnzSvyCdWOHApbGG6LxwSvTlwEBPUR8SMZFxY
TA13eJPz21ZXtXZ9oGvg1BMw+wRwfmem0Sl508c8kJpSfHXD4W89wyG/5Z+1pz5d
s0aHUMVj5YT32yH45Tp192nB5d4XQ7gdUmCLB4HF8fxrrIH2jWU0NX63DT6dXE5w
DUqKq6nTFDHnuTA1IO0B8OAVGv2M/kq8P3Fi+pnVvVao+ttkWIK1z7Ts11gfL7d0
/rE4VgZ7Cwc2o1Fx8s1LCKKLIDrO15aULOSbEa9nl6yQywEEjn2h6cKXHv6RUHM=
=wrvr
-----END PGP SIGNATURE-----


[jira] [Resolved] (HADOOP-8523) test-patch.sh doesn't validate patches before building

2012-07-06 Thread Jonathan Eagles (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles resolved HADOOP-8523.
-

   Resolution: Fixed
Fix Version/s: 3.0.0
   2.0.1-alpha

 test-patch.sh doesn't validate patches before building
 --

 Key: HADOOP-8523
 URL: https://issues.apache.org/jira/browse/HADOOP-8523
 Project: Hadoop Common
  Issue Type: Improvement
  Components: build
Affects Versions: 0.23.3, 2.0.1-alpha
Reporter: Jack Dintruff
Priority: Minor
  Labels: newbie
 Fix For: 2.0.1-alpha, 3.0.0

 Attachments: HADOOP-8523.patch, HADOOP-8523.patch, Hadoop-8523.patch, 
 Hadoop-8523.patch


 When running test-patch.sh with an invalid patch (not formatted properly) or 
 one that doesn't compile, the script spends a lot of time building Hadoop 
 before checking to see if the patch is invalid.  It would help devs if it 
 checked first just in case we run test-patch.sh with a bad patch file. 
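
A sketch of the kind of up-front check being asked for (an assumption about the
approach, not the committed fix): bail out before the long build if the patch does
not even apply cleanly.

  patch -p0 -E --dry-run < HADOOP-8523.patch || { echo "patch does not apply"; exit 1; }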





Re: [ANNOUNCE] Hadoop branch-1.1 and Release Plan for Hadoop-1.1.0

2012-07-06 Thread Matt Foley
Hi Cos,
the query string didn't come through on the link you sent, but the jira query
I use is:
project in (HADOOP,HDFS,MAPREDUCE) and ((Target Version/s = '1.1.0'
and (fixVersion != '1.1.0' or fixVersion is EMPTY)) or (fixVersion =
'1.1.0' and Target Version/s is EMPTY)) and (status != Closed and status
!= Resolved) ORDER BY KEY

You're correct that there are quite a few, currently 107, open jiras
originally targeted for 1.1.0 that do not have committed fixes.  Many of
these are just the inherited backlog of previously identified work.  I need
to move them to Target Version/s = 1.1.1.

Folks have requested that the following currently open jiras be included in
1.1.0:

HADOOP-8417 - HADOOP-6963 didn't update hadoop-core-pom-template.xml
HADOOP-8445 - Token should not print the password in toString
HDFS-96 - HDFS does not support blocks greater than 2GB
HDFS-3596 - Improve FSEditLog pre-allocation in branch-1
MAPREDUCE-2903 - Map Tasks graph is throwing XML Parse error when Job is
executed with 0 maps
MAPREDUCE-2129 - Job may hang if
mapreduce.job.committer.setup.cleanup.needed=false and
mapreduce.map/reduce.failures.maxpercent > 0
---
MAPREDUCE-4049 - plugin for generic shuffle service
HADOOP-7823 - port HADOOP-4012 to branch-1 (splitting support for bzip2)

The first six are simple patches that I am comfortable including.
The last two are complex patches that have not yet been committed.
I am planning to defer those two to 1.1.1.

Beyond that, I'm going to cut 1.1.0-rc0 from the current state of
branch-1.1.
I'm planning to do that this weekend.  This is obviously delayed from the
previous plan, for which I apologize.

Comments welcome.
--Matt


On Tue, Jul 3, 2012 at 8:32 PM, Konstantin Boudnik c...@apache.org wrote:

 Hi Matt.

 I am picking up the hat of BigTop's maintainer for the Hadoop 1.x line, and I
 wanted to sync up with you about the Hadoop 1.1 release outlook, progress, and
 what help you might need, etc.

 I see a few jiras left open in the release
 http://is.gd/OyuaNQ
 Is this the correct representation of the current status?
 How can I help from the BigTop side (I haven't yet finalized the stack's
 versions), etc.? Looking forward to your input. Thanks.

   Cos

 On Fri, May 25, 2012 at 02:49PM, Matt Foley wrote:
  Greetings.  With the approval of a public vote on common-dev@, I have
  branched Hadoop branch-1 to create branch-1.1.  From this, I will create
 a
  release candidate RC-0 for Hadoop-1.1.0, hopefully to be available
 shortly
  after this weekend.
 
  There are over 80 patches in branch-1, over and above the contents of
  hadoop-1.0.3.  So I anticipate that some stabilization will be needed,
  before the RC can be approved as a 1.1.0 release.  Your participation in
  assuring a stable RC is very important.  When it becomes available,
 please
  download it and work with it to determine whether it is stable enough to
  release, and report issues found.  My colleagues and I will do likewise,
 of
  course, but no one company can adequately exercise a new release with
 this
  many new contributions.
 
  There are two outstanding issues that are not yet committed, but I know the
  contributors hope to see them in 1.1.0:
  MAPREDUCE-4049 https://issues.apache.org/jira/browse/MAPREDUCE-4049
 
  HADOOP-4012 https://issues.apache.org/jira/browse/HADOOP-4012
  Assuming there is an RC-1, and that these two patches can be committed
  during stabilization of RC-0, I will plan to incorporate these additional
  items in RC-1.
 
  Best regards,
  --Matt
  Release Manager



Re: No mapred-site.xml in the hadoop-0.23.3 distribution

2012-07-06 Thread Pavan Kulkarni
Hi Robert,

 Can you please share which configuration files are mandatory for
hadoop-0.23.3 to work?
I am tuning a few but am still not able to set it up completely. Thanks
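
For what it's worth, the files usually edited for a small 0.23/YARN cluster are the
following (a general sketch, not an authoritative list; paths assume the extracted
tarball's etc/hadoop directory):

  etc/hadoop/core-site.xml     # fs.default.name / fs.defaultFS (NameNode URI)
  etc/hadoop/hdfs-site.xml     # dfs.replication, NameNode/DataNode directories
  etc/hadoop/yarn-site.xml     # ResourceManager addresses, NodeManager aux-services
  etc/hadoop/mapred-site.xml   # mapreduce.framework.name=yarn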

On Fri, Jul 6, 2012 at 10:24 AM, Robert Evans ev...@yahoo-inc.com wrote:

 Sorry I don't know of a good source for that right now.  Perhaps others on
 the list might know better then I do.





-- 

--With Regards
Pavan Kulkarni


[jira] [Resolved] (HADOOP-8554) KerberosAuthenticator should use the configured principal

2012-07-06 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved HADOOP-8554.
-

Resolution: Invalid

You're right, thanks for the explanation, I didn't realize the principal config 
was server-side only. Also, the reason I hit this with webhdfs and not hftp is 
that hftp doesn't support SPNEGO.

 KerberosAuthenticator should use the configured principal
 -

 Key: HADOOP-8554
 URL: https://issues.apache.org/jira/browse/HADOOP-8554
 Project: Hadoop Common
  Issue Type: Bug
  Components: security
Affects Versions: 1.0.0, 2.0.0-alpha, 2.0.1-alpha, 3.0.0
Reporter: Eli Collins
  Labels: security, webconsole

 In KerberosAuthenticator we construct the principal as follows:
 {code}
 String servicePrincipal = HTTP/ + KerberosAuthenticator.this.url.getHost();
 {code}
 Seems like we should use the configured 
 hadoop.http.authentication.kerberos.principal instead right?
 I hit this issue as a distcp using webhdfs://localhost fails because 
 HTTP/localhost is not in the kerb DB but using webhdfs://eli-thinkpad works 
 because HTTP/eli-thinkpad is (and is my configured principal). distcp using 
 hftp://localhost with the same config works, so it looks like this check is 
 webhdfs specific for some reason (webhdfs is using spnego and hftp is not?).
