Re: Map task can't execute /bin/ls on solaris

2011-08-11 Thread Xiaobo Gu
Does HADOOP_HEAPSIZE apply to all Hadoop-related Java processes, or just
one Java process?

Regards,

Xiaobo Gu

On Thu, Aug 11, 2011 at 1:07 PM, Lance Norskog goks...@gmail.com wrote:
 If the server is dedicated to this job, you might as well give it
 10-15g. After that shakes out, try changing the number of mappers and
 reducers.

 On Tue, Aug 9, 2011 at 2:06 AM, Xiaobo Gu guxiaobo1...@gmail.com wrote:
 Hi Adi,

 Thanks for your response. On an SMP server with 32 GB RAM and 8 cores,
 what is your suggestion for setting HADOOP_HEAPSIZE? The server will be
 dedicated to a single-node Hadoop with one datanode instance, and it
 will run 4 mapper and reducer tasks.

 Regards,

 Xiaobo Gu


 On Sun, Aug 7, 2011 at 11:35 PM, Adi adi.pan...@gmail.com wrote:
Caused by: java.io.IOException: error=12, Not enough space

 You either do not have enough memory allocated to your Hadoop daemons (via
 HADOOP_HEAPSIZE) or not enough swap space.

 -Adi
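On Solaris, error=12 is ENOMEM returned by fork(): the JVM could not reserve
enough swap to spawn /bin/ls. A quick sanity check before raising heap sizes or
adding swap, assuming only the stock Solaris tools:

# swap actually used vs. still available
swap -s
# configured swap devices and their free blocks
swap -l
# the largest memory consumers; the task JVMs should show up near the top
prstat -s size -n 10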

 On Sun, Aug 7, 2011 at 5:48 AM, Xiaobo Gu guxiaobo1...@gmail.com wrote:

 Hi,

 I am trying to write a map-reduce job to convert csv files to
 sequencefiles, but the job fails with the following error:
 java.lang.RuntimeException: Error while running command to get file
 permissions : java.io.IOException: Cannot run program /bin/ls:
 error=12, Not enough space
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:460)
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:200)
        at org.apache.hadoop.util.Shell.run(Shell.java:182)
        at
 org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:375)
        at org.apache.hadoop.util.Shell.execCommand(Shell.java:461)
        at org.apache.hadoop.util.Shell.execCommand(Shell.java:444)
        at
 org.apache.hadoop.fs.RawLocalFileSystem.execCommand(RawLocalFileSystem.java:540)
        at
 org.apache.hadoop.fs.RawLocalFileSystem.access$100(RawLocalFileSystem.java:37)
        at
 org.apache.hadoop.fs.RawLocalFileSystem$RawLocalFileStatus.loadPermissionInfo(RawLocalFileSystem.java:417)
        at
 org.apache.hadoop.fs.RawLocalFileSystem$RawLocalFileStatus.getOwner(RawLocalFileSystem.java:400)
        at
 org.apache.hadoop.mapred.TaskLog.obtainLogDirOwner(TaskLog.java:176)
        at
 org.apache.hadoop.mapred.TaskLogsTruncater.truncateLogs(TaskLogsTruncater.java:124)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:264)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
        at org.apache.hadoop.mapred.Child.main(Child.java:253)
 Caused by: java.io.IOException: error=12, Not enough space
        at java.lang.UNIXProcess.forkAndExec(Native Method)
        at java.lang.UNIXProcess.init(UNIXProcess.java:53)
        at java.lang.ProcessImpl.start(ProcessImpl.java:65)
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:453)
        ... 16 more

        at
 org.apache.hadoop.fs.RawLocalFileSystem$RawLocalFileStatus.loadPermissionInfo(RawLocalFileSystem.java:442)
        at
 org.apache.hadoop.fs.RawLocalFileSystem$RawLocalFileStatus.getOwner(RawLocalFileSystem.java:400)
        at
 org.apache.hadoop.mapred.TaskLog.obtainLogDirOwner(TaskLog.java:176)
        at
 org.apache.hadoop.mapred.TaskLogsTruncater.truncateLogs(TaskLogsTruncater.java:124)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:264)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
        at org.apache.hadoop.mapred.Child.main(Child.java:253)






 --
 Lance Norskog
 goks...@gmail.com



Re: Map task can't execute /bin/ls on solaris

2011-08-11 Thread Harsh J
It applies to all Hadoop daemon processes (JT, TT, NN, SNN, DN) and
all direct commands executed via the 'hadoop' executable.
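HADOOP_HEAPSIZE lives in conf/hadoop-env.sh, and since every daemon and the
'hadoop' script source that file, one value applies across the board unless a
per-daemon *_OPTS variable overrides it. A minimal sketch (the sizes are only
examples):

# conf/hadoop-env.sh
# Heap in MB for every Hadoop daemon and for bin/hadoop client commands
export HADOOP_HEAPSIZE=2000

# Optional per-daemon overrides
export HADOOP_NAMENODE_OPTS="-Xmx4g $HADOOP_NAMENODE_OPTS"
export HADOOP_DATANODE_OPTS="-Xmx1g $HADOOP_DATANODE_OPTS"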

On Thu, Aug 11, 2011 at 11:37 AM, Xiaobo Gu guxiaobo1...@gmail.com wrote:
 Is HADOOP_HEAPSIZE set for all Hadoop related Java processes, or just
 one Java process?

 Regards,

 Xiaobo Gu

 On Thu, Aug 11, 2011 at 1:07 PM, Lance Norskog goks...@gmail.com wrote:
 If the server is dedicated to this job, you might as well give it
 10-15g. After that shakes out, try changing the number of mappers and
 reducers.

 On Tue, Aug 9, 2011 at 2:06 AM, Xiaobo Gu guxiaobo1...@gmail.com wrote:
 Hi Adi,

 Thanks for your response. On an SMP server with 32 GB RAM and 8 cores,
 what is your suggestion for setting HADOOP_HEAPSIZE? The server will be
 dedicated to a single-node Hadoop with one datanode instance, and it
 will run 4 mapper and reducer tasks.

 Regards,

 Xiaobo Gu


 On Sun, Aug 7, 2011 at 11:35 PM, Adi adi.pan...@gmail.com wrote:
Caused by: java.io.IOException: error=12, Not enough space

 You either do not have enough memory allocated to your Hadoop daemons (via
 HADOOP_HEAPSIZE) or not enough swap space.

 -Adi

 On Sun, Aug 7, 2011 at 5:48 AM, Xiaobo Gu guxiaobo1...@gmail.com wrote:

 Hi,

 I am trying to write a map-reduce job to convert csv files to
 sequencefiles, but the job fails with the following error:
 java.lang.RuntimeException: Error while running command to get file
 permissions : java.io.IOException: Cannot run program /bin/ls:
 error=12, Not enough space
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:460)
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:200)
        at org.apache.hadoop.util.Shell.run(Shell.java:182)
        at
 org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:375)
        at org.apache.hadoop.util.Shell.execCommand(Shell.java:461)
        at org.apache.hadoop.util.Shell.execCommand(Shell.java:444)
        at
 org.apache.hadoop.fs.RawLocalFileSystem.execCommand(RawLocalFileSystem.java:540)
        at
 org.apache.hadoop.fs.RawLocalFileSystem.access$100(RawLocalFileSystem.java:37)
        at
 org.apache.hadoop.fs.RawLocalFileSystem$RawLocalFileStatus.loadPermissionInfo(RawLocalFileSystem.java:417)
        at
 org.apache.hadoop.fs.RawLocalFileSystem$RawLocalFileStatus.getOwner(RawLocalFileSystem.java:400)
        at
 org.apache.hadoop.mapred.TaskLog.obtainLogDirOwner(TaskLog.java:176)
        at
 org.apache.hadoop.mapred.TaskLogsTruncater.truncateLogs(TaskLogsTruncater.java:124)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:264)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
        at org.apache.hadoop.mapred.Child.main(Child.java:253)
 Caused by: java.io.IOException: error=12, Not enough space
        at java.lang.UNIXProcess.forkAndExec(Native Method)
        at java.lang.UNIXProcess.init(UNIXProcess.java:53)
        at java.lang.ProcessImpl.start(ProcessImpl.java:65)
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:453)
        ... 16 more

        at
 org.apache.hadoop.fs.RawLocalFileSystem$RawLocalFileStatus.loadPermissionInfo(RawLocalFileSystem.java:442)
        at
 org.apache.hadoop.fs.RawLocalFileSystem$RawLocalFileStatus.getOwner(RawLocalFileSystem.java:400)
        at
 org.apache.hadoop.mapred.TaskLog.obtainLogDirOwner(TaskLog.java:176)
        at
 org.apache.hadoop.mapred.TaskLogsTruncater.truncateLogs(TaskLogsTruncater.java:124)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:264)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
        at org.apache.hadoop.mapred.Child.main(Child.java:253)






 --
 Lance Norskog
 goks...@gmail.com





-- 
Harsh J


My cluster datanode machine can't start

2011-08-11 Thread devilsp4
Hi,

  I deployed a Hadoop cluster using two machines: one as the namenode and the
other as a datanode.

  My namenode machine's hostname is namenode1, and the datanode machine's
hostname is datanode1.

  When I run ./start-all.sh on namenode1, the console displays the following:

root@namenode1:/opt/hadoop/bin# ./start-all.sh
starting namenode, logging to 
/opt/hadoop/bin/../logs/hadoop-root-namenode-namenode1.out
datanode1: starting datanode, logging to 
/opt/hadoop/bin/../logs/hadoop-root-datanode-datanode1.out
namenode1: starting secondarynamenode, logging to 
/opt/hadoop/bin/../logs/hadoop-root-secondarynamenode-namenode1.out
starting jobtracker, logging to 
/opt/hadoop/bin/../logs/hadoop-root-jobtracker-namenode1.out
datanode1: starting tasktracker, logging to 
/opt/hadoop/bin/../logs/hadoop-root-tasktracker-datanode1.out

Running jps on namenode1 shows the following Java processes:

15438 JobTracker
15159 NameNode
15582 Jps
15362 SecondaryNameNode

After ssh'ing to datanode1, jps shows the following:

21417 TaskTracker
21497 Jps


So the datanode did not start, and I looked at the logs:

[root@datanode1 logs]# ls
hadoop-root-datanode-datanode1.out    hadoop-root-tasktracker-datanode1.log
hadoop-root-tasktracker-datanode1.out.2
hadoop-root-datanode-datanode1.out.1  hadoop-root-tasktracker-datanode1.out
hadoop-root-datanode-datanode1.out.2  hadoop-root-tasktracker-datanode1.out.1

[root@datanode1 logs]# cat hadoop-root-datanode-datanode1.out
Unrecognized option: -jvm
Could not create the Java virtual machine.


Next, what should I do to solve this problem?


Thanks. devilsp


Re: Where is web interface in stand alone operation?

2011-08-11 Thread A Df
Hi again:


I did format the namenode; the first attempt complained about a folder being 
locked, but the second attempt formatted successfully. It is still not working, 
though. When I try to copy the input files and run the example jar, it gives:

my-user@ngs:~/hadoop-0.20.2_pseudo bin/hadoop fs -put input input
11/08/11 10:25:11 WARN hdfs.DFSClient: DataStreamer Exception: 
org.apache.hadoop.ipc.RemoteException: java.io.IOException: File 
/user/my-user/input/HadoopInputFile_Request_2011-08-05_162106_1.txt could only 
be replicated to 0 nodes, instead of 1
    at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1271)
    at 
org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)

    at org.apache.hadoop.ipc.Client.call(Client.java:740)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
    at $Proxy0.addBlock(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
    at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
    at $Proxy0.addBlock(Unknown Source)
    at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2937)
    at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2819)
    at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2102)
    at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2288)

11/08/11 10:25:11 WARN hdfs.DFSClient: Error Recovery for block null bad 
datanode[0] nodes == null
11/08/11 10:25:11 WARN hdfs.DFSClient: Could not get block locations. Source 
file /user/my-user/input/HadoopInputFile_Request_2011-08-05_162106_1.txt - 
Aborting...
put: java.io.IOException: File 
/user/my-user/input/HadoopInputFile_Request_2011-08-05_162106_1.txt could only 
be replicated to 0 nodes, instead of 1
11/08/11 10:25:11 ERROR hdfs.DFSClient: Exception closing file 
/user/my-user/input/HadoopInputFile_Request_2011-08-05_162106_1.txt : 
org.apache.hadoop.ipc.RemoteException: java.io.IOException: File 
/user/my-user/input/HadoopInputFile_Request_2011-08-05_162106_1.txt could only 
be replicated to 0 nodes, instead of 1
    at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1271)
    at 
org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)

org.apache.hadoop.ipc.RemoteException: java.io.IOException: File 
/user/my-user/input/HadoopInputFile_Request_2011-08-05_162106_1.txt could only 
be replicated to 0 nodes, instead of 1
    at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1271)
    at 
org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at 

Re: Where is web interface in stand alone operation?

2011-08-11 Thread Harsh J
Looks like your DataNode isn't properly up. Wipe your dfs.data.dir
directory and restart your DN (it might be because of the formatting
troubles you had earlier). Take a look at your DN's logs though, to
confirm and understand what's going wrong.
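A quick way to confirm whether the DataNode actually registered, using only the
stock tools (paths relative to the Hadoop install):

# is a DataNode JVM even running on this box?
jps | grep -i datanode

# what does the NameNode think? live/dead datanodes and configured capacity
bin/hadoop dfsadmin -report

# the DN log holds the real reason for the failure
less logs/hadoop-*-datanode-*.log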

On Thu, Aug 11, 2011 at 3:03 PM, A Df abbey_dragonfor...@yahoo.com wrote:
 Hi again:

 I did format the namenode and it had a problem with a folder being locked. I
 tried again and it formatted but still unable to work. I tried to copy input
 files and run example jar. It gives:
 my-user@ngs:~/hadoop-0.20.2_pseudo bin/hadoop fs -put input input
 11/08/11 10:25:11 WARN hdfs.DFSClient: DataStreamer Exception:
 org.apache.hadoop.ipc.RemoteException: java.io.IOException: File
 /user/my-user/input/HadoopInputFile_Request_2011-08-05_162106_1.txt could
 only be replicated to 0 nodes, instead of 1
     at
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1271)
     at
 org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
     at
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
     at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
     at java.lang.reflect.Method.invoke(Method.java:597)
     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
     at java.security.AccessController.doPrivileged(Native Method)
     at javax.security.auth.Subject.doAs(Subject.java:396)
     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)

     at org.apache.hadoop.ipc.Client.call(Client.java:740)
     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
     at $Proxy0.addBlock(Unknown Source)
     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
     at
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
     at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
     at java.lang.reflect.Method.invoke(Method.java:597)
     at
 org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
     at
 org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
     at $Proxy0.addBlock(Unknown Source)
     at
 org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2937)
     at
 org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2819)
     at
 org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2102)
     at
 org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2288)

 11/08/11 10:25:11 WARN hdfs.DFSClient: Error Recovery for block null bad
 datanode[0] nodes == null
 11/08/11 10:25:11 WARN hdfs.DFSClient: Could not get block locations. Source
 file /user/my-user/input/HadoopInputFile_Request_2011-08-05_162106_1.txt -
 Aborting...
 put: java.io.IOException: File
 /user/my-user/input/HadoopInputFile_Request_2011-08-05_162106_1.txt could
 only be replicated to 0 nodes, instead of 1
 11/08/11 10:25:11 ERROR hdfs.DFSClient: Exception closing file
 /user/my-user/input/HadoopInputFile_Request_2011-08-05_162106_1.txt :
 org.apache.hadoop.ipc.RemoteException: java.io.IOException: File
 /user/my-user/input/HadoopInputFile_Request_2011-08-05_162106_1.txt could
 only be replicated to 0 nodes, instead of 1
     at
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1271)
     at
 org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
     at
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
     at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
     at java.lang.reflect.Method.invoke(Method.java:597)
     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
     at java.security.AccessController.doPrivileged(Native Method)
     at javax.security.auth.Subject.doAs(Subject.java:396)
     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)

 org.apache.hadoop.ipc.RemoteException: java.io.IOException: File
 /user/my-user/input/HadoopInputFile_Request_2011-08-05_162106_1.txt could
 only be replicated to 0 nodes, instead of 1
     at
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1271)
     at
 org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
     

Re: My cluster datanode machine can't start

2011-08-11 Thread Harsh J
A quick workaround is to not run your services as root.

(Actually, you shouldn't run Hadoop as root ever!)
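A minimal sketch of switching to a dedicated user (the user name and paths are
only examples; chown whatever directories dfs.name.dir and dfs.data.dir point
at on your nodes):

# on every node
useradd -m hadoop
chown -R hadoop:hadoop /opt/hadoop
chown -R hadoop:hadoop /tmp/hadoop-root        # or your configured dfs dirs

# start the cluster as that user (it needs passwordless ssh to the slaves)
su - hadoop -c '/opt/hadoop/bin/start-all.sh'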

On Thu, Aug 11, 2011 at 3:02 PM, devilsp4 devil...@gmail.com wrote:
 Hi,

      I deploy hadoop cluster use two machine.one as a namenode,and the other 
 be used a datanode.

      My namenode machine hostname is namenode1,and datanode machine hostname 
 is datanode1.

      when I use command ./start-all.sh on namenode1,the console display below 
 string,

 root@namenode1:/opt/hadoop/bin# ./start-all.sh
 starting namenode, logging to 
 /opt/hadoop/bin/../logs/hadoop-root-namenode-namenode1.out
 datanode1: starting datanode, logging to 
 /opt/hadoop/bin/../logs/hadoop-root-datanode-datanode1.out
 namenode1: starting secondarynamenode, logging to 
 /opt/hadoop/bin/../logs/hadoop-root-secondarynamenode-namenode1.out
 starting jobtracker, logging to 
 /opt/hadoop/bin/../logs/hadoop-root-jobtracker-namenode1.out
 datanode1: starting tasktracker, logging to 
 /opt/hadoop/bin/../logs/hadoop-root-tasktracker-datanode1.out

 and use jps show java processs,display below string,

    15438 JobTracker
 15159 NameNode
 15582 Jps
 15362 SecondaryNameNode

 and ssh datanode1,use comman jps,display below somethins strings

 21417 TaskTracker
 21497 Jps


 so,the datanode can't run,and I find logs

 [root@datanode1 logs]# ls
 hadoop-root-datanode-datanode1.out    hadoop-root-tasktracker-datanode1.log   
  hadoop-root-tasktracker-datanode1.out.2
 hadoop-root-datanode-datanode1.out.1  hadoop-root-tasktracker-datanode1.out
 hadoop-root-datanode-datanode1.out.2  hadoop-root-tasktracker-datanode1.out.1

 [root@datanode1 logs]# cat hadoop-root-datanode-datanode1.out
 Unrecognized option: -jvm
 Could not create the Java virtual machine.


    Next, what should I do to solve this problem。


    Thanks. devilsp




-- 
Harsh J


Re: Installing Hadoop

2011-08-11 Thread V@ni

http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/

Try this link. It might be useful for you.
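For what it's worth, in 0.20.x the single hadoop-site.xml from the old r0.17
docs was split into core-site.xml, hdfs-site.xml and mapred-site.xml under
conf/. A minimal single-node core-site.xml might look like this (host and port
are only examples):

<!-- conf/core-site.xml -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>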


jgroups wrote:
 
 I am trying to install Hadoop in a cluster environment with multiple nodes,
 following the instructions from
 
 http://hadoop.apache.org/common/docs/r0.17.0/cluster_setup.html
 
 That page refers to hadoop-site.xml, but I don't see that file in
 /hadoop-0.20.203.0/conf. Are there more up-to-date installation
 instructions somewhere else?
 

-- 
View this message in context: 
http://old.nabble.com/Installing-Hadoop-tp31683812p32240838.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.



Re: hadoop startup problem

2011-08-11 Thread V@ni

Hi, I had this issue too. I later found out that a certain folder has to be 
cleared every time after running, but I'm not sure which folder in Hadoop it is.
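The folder in question is usually the Hadoop storage under hadoop.tmp.dir,
which defaults to a directory below /tmp and can be wiped by the OS between
runs. Rather than clearing it, pointing the storage directories somewhere
persistent avoids the problem altogether; a sketch with example paths:

<!-- conf/hdfs-site.xml -->
<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/var/hadoop/dfs/name</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/var/hadoop/dfs/data</value>
  </property>
</configuration>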



asmaa.atef wrote:
 
 hello everyone,
 I have a problem with Hadoop startup: every time I try to start Hadoop, the
 namenode does not start, and when I try to stop the namenode it gives an
 error: "no namenode to start".
 I tried formatting the namenode and then it works, but now I have data in
 Hadoop and formatting the namenode would erase all of it.
 What can I do?
 thanks in advance,
 asmaa
 

-- 
View this message in context: 
http://old.nabble.com/hadoop-startup-problem-tp25800609p32240938.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.



Re: Where is web interface in stand alone operation?

2011-08-11 Thread A Df
Hello:

I cannot reply inline, so here I go again. I checked the datanode logs, and they 
showed a namespaceID mismatch between the namenode and the datanode. I am not 
sure why, since I did not change those values. A sample log for those interested 
is below, and my message continues after it.


2011-08-11 10:23:58,630 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG:   host = ngs.wmin.ac.uk/161.74.12.97
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 0.20.2
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
************************************************************/
2011-08-11 10:23:59,208 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Incompatible namespaceIDs in /tmp/hadoop-w1153435/dfs/data: namenode namespaceID = 915370409; datanode namespaceID = 1914136941
    at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:233)
    at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:148)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:298)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:216)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1283)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1238)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1246)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1368)

2011-08-11 10:23:59,209 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at ngs.wmin.ac.uk/161.74.12.97
************************************************************/


I stopped Hadoop, deleted the data from the dfs.data.dir, and restarted 
everything. I also had to delete the input and output directories and set them 
up again; after that it ran properly. I tried the web interfaces as well and 
they both worked. Thanks.
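For anyone else hitting the Incompatible namespaceIDs error, the steps above
boil down to the following (the path is the default under hadoop.tmp.dir and
may differ on your setup); the alternative is to edit the namespaceID line in
dfs/data/current/VERSION so it matches the namenode's value:

bin/stop-all.sh
# remove the stale datanode storage so it re-registers with the new namespaceID
rm -rf /tmp/hadoop-${USER}/dfs/data
bin/start-all.sh
# re-upload the input after the wipe
bin/hadoop fs -put input input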


I will still have a look at the logs; however, since standalone mode does not 
produce logs, another user suggested using the time command and strace. How 
would I use strace, given that the job runs to completion and there is no window 
in which to attach to it? I wanted to get job details like those produced in the 
logs under pseudo-distributed operation, but for the standalone Java process 
instead. Is there a way to inspect the Java process for a standalone job while 
it is running, or afterwards?
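One way around the timing problem is not to attach at all but to start the job
under the tracing tool, or to grab the pid with jps while it runs; a rough
sketch (the jar name matches the 0.20.2 examples jar, and GNU time is assumed
for -v):

# trace the whole standalone run; -f follows the forked children
strace -f -o job.strace bin/hadoop jar hadoop-0.20.2-examples.jar wordcount input output
/usr/bin/time -v bin/hadoop jar hadoop-0.20.2-examples.jar wordcount input output

# or inspect the JVM while the job is still running
jps -lv            # find the RunJar pid
jstack <pid>       # thread dump
jmap -heap <pid>   # heap summary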


Cheers,
A Df





From: Harsh J ha...@cloudera.com
To: A Df abbey_dragonfor...@yahoo.com
Cc: common-user@hadoop.apache.org common-user@hadoop.apache.org
Sent: Thursday, 11 August 2011, 10:37
Subject: Re: Where is web interface in stand alone operation?

Looks like your DataNode isn't properly up. Wipe your dfs.data.dir
directory and restart your DN (might be cause of the formatting
troubles you had earlier). Take a look at your DN's logs though, to
confirm and understand what's going wrong.

On Thu, Aug 11, 2011 at 3:03 PM, A Df abbey_dragonfor...@yahoo.com wrote:
 Hi again:

 I did format the namenode and it had a problem with a folder being locked. I
 tried again and it formatted but still unable to work. I tried to copy input
 files and run example jar. It gives:
 my-user@ngs:~/hadoop-0.20.2_pseudo bin/hadoop fs -put input input
 11/08/11 10:25:11 WARN hdfs.DFSClient: DataStreamer Exception:
 org.apache.hadoop.ipc.RemoteException: java.io.IOException: File
 /user/my-user/input/HadoopInputFile_Request_2011-08-05_162106_1.txt could
 only be replicated to 0 nodes, instead of 1
     at
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1271)
     at
 org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
     at
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
     at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
     at java.lang.reflect.Method.invoke(Method.java:597)
     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
     at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
     at java.security.AccessController.doPrivileged(Native Method)
     at javax.security.auth.Subject.doAs(Subject.java:396)
     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)

     at org.apache.hadoop.ipc.Client.call(Client.java:740)
     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
     at 

Re: Map task can't execute /bin/ls on solaris

2011-08-11 Thread Adi
Some other options that affect the number of mappers and reducers and the
amount of memory they use:

mapred.child.java.opts, e.g. -Xmx1200M (the heap for each mapper/reducer JVM,
plus any other Java options) - together with the memory available per node,
this decides how many task slots you can afford.

The split size will affect the number of splits (and therefore the number of
mappers), depending on your input files and input format (if you are using
FileInputFormat or deriving from it):
mapreduce.input.fileinputformat.split.maxsize  (max number of bytes per split)
mapreduce.input.fileinputformat.split.minsize  (min number of bytes per split)

-Adi
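Pulled together, a sketch of the corresponding mapred-site.xml entries (the
values are only illustrations, and the per-node slot counts are the extra knobs
for the number of mappers and reducers):

<!-- conf/mapred-site.xml -->
<configuration>
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>4</value>
  </property>
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>2</value>
  </property>
  <property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx1200m</value>
  </property>
  <property>
    <name>mapreduce.input.fileinputformat.split.maxsize</name>
    <value>268435456</value>   <!-- 256 MB -->
  </property>
  <property>
    <name>mapreduce.input.fileinputformat.split.minsize</name>
    <value>134217728</value>   <!-- 128 MB -->
  </property>
</configuration>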



On Thu, Aug 11, 2011 at 2:11 AM, Harsh J ha...@cloudera.com wrote:

 It applies to all Hadoop daemon processes (JT, TT, NN, SNN, DN) and
 all direct commands executed via the 'hadoop' executable.

 On Thu, Aug 11, 2011 at 11:37 AM, Xiaobo Gu guxiaobo1...@gmail.com
 wrote:
  Is HADOOP_HEAPSIZE set for all Hadoop related Java processes, or just
  one Java process?
 
  Regards,
 
  Xiaobo Gu
 
  On Thu, Aug 11, 2011 at 1:07 PM, Lance Norskog goks...@gmail.com
 wrote:
  If the server is dedicated to this job, you might as well give it
  10-15g. After that shakes out, try changing the number of mappers and
  reducers.
 
  On Tue, Aug 9, 2011 at 2:06 AM, Xiaobo Gu guxiaobo1...@gmail.com
 wrote:
  Hi Adi,
 
  Thanks for your response. On an SMP server with 32 GB RAM and 8 cores,
  what is your suggestion for setting HADOOP_HEAPSIZE? The server will be
  dedicated to a single-node Hadoop with one datanode instance, and it
  will run 4 mapper and reducer tasks.
 
  Regards,
 
  Xiaobo Gu
 
 
  On Sun, Aug 7, 2011 at 11:35 PM, Adi adi.pan...@gmail.com wrote:
 Caused by: java.io.IOException: error=12, Not enough space
 
  You either do not have enough memory allocated to your hadoop
 daemons(via
  HADOOP_HEAPSIZE) or swap space.
 
  -Adi
 
  On Sun, Aug 7, 2011 at 5:48 AM, Xiaobo Gu guxiaobo1...@gmail.com
 wrote:
 
  Hi,
 
  I am trying to write a map-reduce job to convert csv files to
  sequencefiles, but the job fails with the following error:
  java.lang.RuntimeException: Error while running command to get file
  permissions : java.io.IOException: Cannot run program /bin/ls:
  error=12, Not enough space
 at java.lang.ProcessBuilder.start(ProcessBuilder.java:460)
 at org.apache.hadoop.util.Shell.runCommand(Shell.java:200)
 at org.apache.hadoop.util.Shell.run(Shell.java:182)
 at
 
 org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:375)
 at org.apache.hadoop.util.Shell.execCommand(Shell.java:461)
 at org.apache.hadoop.util.Shell.execCommand(Shell.java:444)
 at
 
 org.apache.hadoop.fs.RawLocalFileSystem.execCommand(RawLocalFileSystem.java:540)
 at
 
 org.apache.hadoop.fs.RawLocalFileSystem.access$100(RawLocalFileSystem.java:37)
 at
 
 org.apache.hadoop.fs.RawLocalFileSystem$RawLocalFileStatus.loadPermissionInfo(RawLocalFileSystem.java:417)
 at
 
 org.apache.hadoop.fs.RawLocalFileSystem$RawLocalFileStatus.getOwner(RawLocalFileSystem.java:400)
 at
  org.apache.hadoop.mapred.TaskLog.obtainLogDirOwner(TaskLog.java:176)
 at
 
 org.apache.hadoop.mapred.TaskLogsTruncater.truncateLogs(TaskLogsTruncater.java:124)
 at org.apache.hadoop.mapred.Child$4.run(Child.java:264)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at
 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
 at org.apache.hadoop.mapred.Child.main(Child.java:253)
  Caused by: java.io.IOException: error=12, Not enough space
 at java.lang.UNIXProcess.forkAndExec(Native Method)
 at java.lang.UNIXProcess.init(UNIXProcess.java:53)
 at java.lang.ProcessImpl.start(ProcessImpl.java:65)
 at java.lang.ProcessBuilder.start(ProcessBuilder.java:453)
 ... 16 more
 
 at
 
 org.apache.hadoop.fs.RawLocalFileSystem$RawLocalFileStatus.loadPermissionInfo(RawLocalFileSystem.java:442)
 at
 
 org.apache.hadoop.fs.RawLocalFileSystem$RawLocalFileStatus.getOwner(RawLocalFileSystem.java:400)
 at
  org.apache.hadoop.mapred.TaskLog.obtainLogDirOwner(TaskLog.java:176)
 at
 
 org.apache.hadoop.mapred.TaskLogsTruncater.truncateLogs(TaskLogsTruncater.java:124)
 at org.apache.hadoop.mapred.Child$4.run(Child.java:264)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at
 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
 at org.apache.hadoop.mapred.Child.main(Child.java:253)
 
 
 
 
 
 
  --
  Lance Norskog
  goks...@gmail.com
 
 



 --
 Harsh J



Re: java.lang.NoClassDefFoundError: com.sun.security.auth.UnixPrincipal

2011-08-11 Thread guxiaobo1...@gmail.com
Hi,

I failed to install the Oracle JDK on IBM AIX, with the following errors:

$ chmod +x jdk-6u24-linux-i586.bin
$ ./jdk-6u24-linux-i586.bin
Unpacking...
Checksumming...
Extracting...
./jdk-6u24-linux-i586.bin: ./install.sfx.409892: 0403-006 Execute permission
denied.
Failed to extract the files.  Please refer to the Troubleshooting section of
the Installation Instructions on the download page for more information.


--
View this message in context: 
http://hadoop-common.472056.n3.nabble.com/java-lang-NoClassDefFoundError-com-sun-security-auth-UnixPrincipal-tp2989927p3246183.html
Sent from the Users mailing list archive at Nabble.com.


Hadoop in PBS + Lustre

2011-08-11 Thread Shi Yu

Hi,

I found some materials about submitting Hadoop jobs via PBS. Any idea 
how to interactively browse HDFS through PBS? Our supercomputer uses a 
Lustre storage system. I found a wiki page about using a PolyServe 
storage system instead of HDFS. Has anyone tried Lustre + Hadoop + PBS? 
Is any special care needed to do that? Thanks!
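For the interactive browsing, one approach that works with plain Torque/PBS
(assuming qsub allows interactive jobs and the Hadoop client config is visible
on the compute nodes) is simply:

# ask PBS for an interactive shell on a compute node
qsub -I -l nodes=1:ppn=1,walltime=00:30:00
# then use the ordinary HDFS client from that shell
hadoop fs -ls /
hadoop fs -ls /user/$USER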


Shi


Hadoop users in Pittsburgh PA

2011-08-11 Thread Doug Balog
Hey, 
 If you are using Hadoop in Pittsburgh, please send me a quick note with a 
short description of what you are
doing. I'm especially interested in companies using Hadoop.

I run the local Pittsburgh Hadoop Users Group Meetup.

Thanks,
Doug



Avatar namenode?

2011-08-11 Thread shanmuganathan.r

Hi All,

  I am running HBase in distributed mode on a seven-node cluster with a backup 
master, and HBase runs properly in that backup-master setup. I now want to run 
HBase on top of a highly available Hadoop. I read about the Avatar node at 
http://hadoopblog.blogspot.com/2010/02/hadoop-namenode-high-availability.html 
and I need more help with the Avatar namenode configuration.


1. Which IP should the datanodes' fs.default.name point to?
2. Is there a good method other than the Avatar node for a backup namenode?



Regards,

Shanmuganathan




Question about RAID controllers and hadoop

2011-08-11 Thread Koert Kuipers
Hello all,
We are considering using low-end HP ProLiant machines (DL160s and DL180s)
for cluster nodes. However, with these machines, if you want more than 4
hard drives HP puts in a P410 RAID controller. We would configure the
RAID controller to act as JBOD by simply creating multiple RAID
volumes of one disk each. Does anyone have experience with this setup? Is it a
good idea, or am I introducing an I/O bottleneck?
Thanks for your help!
Best, Koert
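Whichever way the controller ends up presenting the disks, Hadoop itself only
needs one mount point per spindle listed in its config; a sketch of the
relevant entries (the mount points are examples):

<!-- hdfs-site.xml -->
<property>
  <name>dfs.data.dir</name>
  <value>/data1/dfs,/data2/dfs,/data3/dfs,/data4/dfs</value>
</property>

<!-- mapred-site.xml -->
<property>
  <name>mapred.local.dir</name>
  <value>/data1/mapred,/data2/mapred,/data3/mapred,/data4/mapred</value>
</property>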


Re: Question about RAID controllers and hadoop

2011-08-11 Thread Bharath Mundlapudi
True, you need a P410 controller. You can create a RAID0 volume for each disk 
to make it behave like JBOD.


-Bharath
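If it helps, the single-disk RAID0 volumes can be scripted with HP's hpacucli
utility; a rough sketch (the controller slot and drive addresses are
placeholders and will differ per box):

# show the controller and the physical drives it sees
hpacucli ctrl all show config

# one RAID0 logical drive per physical disk
hpacucli ctrl slot=0 create type=ld drives=1I:1:1 raid=0
hpacucli ctrl slot=0 create type=ld drives=1I:1:2 raid=0
hpacucli ctrl slot=0 create type=ld drives=1I:1:3 raid=0
hpacucli ctrl slot=0 create type=ld drives=1I:1:4 raid=0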




From: Koert Kuipers ko...@tresata.com
To: common-user@hadoop.apache.org
Sent: Thursday, August 11, 2011 2:50 PM
Subject: Question about RAID controllers and hadoop

Hello all,
We are considering using low end HP proliant machines (DL160s and DL180s)
for cluster nodes. However with these machines if you want to do more than 4
hard drives then HP puts in a P410 raid controller. We would configure the
RAID controller to function as JBOD, by simply creating multiple RAID
volumes with one disk. Does anyone have experience with this setup? Is it a
good idea, or am i introducing a i/o bottleneck?
Thanks for your help!
Best, Koert

RE: Question about RAID controllers and hadoop

2011-08-11 Thread GOEKE, MATTHEW (AG/1000)
My assumption would be that a set of 4 single-disk RAID0 volumes would actually 
be better than a controller that allowed pure JBOD of 4 disks, due to the cache 
on the controller. If anyone has personal experience with this I would love to 
see performance numbers; our infrastructure guy is running tests on exactly this 
over the next couple of days, so I will pass the results along once we have 
them.

Matt

-Original Message-
From: Bharath Mundlapudi [mailto:bharathw...@yahoo.com] 
Sent: Thursday, August 11, 2011 5:00 PM
To: common-user@hadoop.apache.org
Subject: Re: Question about RAID controllers and hadoop

True, you need a P410 controller. You can create RAID0 for each disk to make it 
as JBOD.


-Bharath




From: Koert Kuipers ko...@tresata.com
To: common-user@hadoop.apache.org
Sent: Thursday, August 11, 2011 2:50 PM
Subject: Question about RAID controllers and hadoop

Hello all,
We are considering using low end HP proliant machines (DL160s and DL180s)
for cluster nodes. However with these machines if you want to do more than 4
hard drives then HP puts in a P410 raid controller. We would configure the
RAID controller to function as JBOD, by simply creating multiple RAID
volumes with one disk. Does anyone have experience with this setup? Is it a
good idea, or am i introducing a i/o bottleneck?
Thanks for your help!
Best, Koert
This e-mail message may contain privileged and/or confidential information, and 
is intended to be received only by persons entitled
to receive such information. If you have received this e-mail in error, please 
notify the sender immediately. Please delete it and
all attachments from any servers, hard drives or any other media. Other use of 
this e-mail by you is strictly prohibited.

All e-mails and attachments sent and received are subject to monitoring, 
reading and archival by Monsanto, including its
subsidiaries. The recipient of this e-mail is solely responsible for checking 
for the presence of Viruses or other Malware.
Monsanto, along with its subsidiaries, accepts no liability for any damage 
caused by any such code transmitted by or accompanying
this e-mail or any attachment.


The information contained in this email may be subject to the export control 
laws and regulations of the United States, potentially
including but not limited to the Export Administration Regulations (EAR) and 
sanctions regulations issued by the U.S. Department of
Treasury, Office of Foreign Asset Controls (OFAC).  As a recipient of this 
information you are obligated to comply with all
applicable U.S. export laws and regulations.



Re: Question about RAID controllers and hadoop

2011-08-11 Thread Kai Voigt
Yahoo did some testing 2 years ago: http://markmail.org/message/xmzc45zi25htr7ry

But an updated benchmark would be interesting to see.

Kai

Am 12.08.2011 um 00:13 schrieb GOEKE, MATTHEW (AG/1000):

 My assumption would be that having a set of 4 raid 0 disks would actually be 
 better than having a controller that allowed pure JBOD of 4 disks due to the 
 cache on the controller. If anyone has any personal experience with this I 
 would love to know performance numbers but our infrastructure guy is doing 
 tests on exactly this over the next couple days so I will pass it along once 
 we have it.
 
 Matt
 
 -Original Message-
 From: Bharath Mundlapudi [mailto:bharathw...@yahoo.com] 
 Sent: Thursday, August 11, 2011 5:00 PM
 To: common-user@hadoop.apache.org
 Subject: Re: Question about RAID controllers and hadoop
 
 True, you need a P410 controller. You can create RAID0 for each disk to make 
 it as JBOD.
 
 
 -Bharath
 
 
 
 
 From: Koert Kuipers ko...@tresata.com
 To: common-user@hadoop.apache.org
 Sent: Thursday, August 11, 2011 2:50 PM
 Subject: Question about RAID controllers and hadoop
 
 Hello all,
 We are considering using low end HP proliant machines (DL160s and DL180s)
 for cluster nodes. However with these machines if you want to do more than 4
 hard drives then HP puts in a P410 raid controller. We would configure the
 RAID controller to function as JBOD, by simply creating multiple RAID
 volumes with one disk. Does anyone have experience with this setup? Is it a
 good idea, or am i introducing a i/o bottleneck?
 Thanks for your help!
 Best, Koert
 

-- 
Kai Voigt
k...@123.org






Re: Question about RAID controllers and hadoop

2011-08-11 Thread Charles Wimmer
We currently use P410s in a 12-disk system. Each disk is set up as a RAID0 
volume. Performance is at least as good as a bare disk.


On 8/11/11 3:23 PM, GOEKE, MATTHEW (AG/1000) matthew.go...@monsanto.com 
wrote:

If I read that email chain correctly then they were referring to the classic 
JBOD vs multiple disks striped together conversation. The conversation that was 
started here is referring to JBOD vs 1 RAID 0 per disk and the effects of the 
raid controller on those independent raids.

Matt

-Original Message-
From: Kai Voigt [mailto:k...@123.org]
Sent: Thursday, August 11, 2011 5:17 PM
To: common-user@hadoop.apache.org
Subject: Re: Question about RAID controllers and hadoop

Yahoo did some testing 2 years ago: http://markmail.org/message/xmzc45zi25htr7ry

But updated benchmark would be interesting to see.

Kai

Am 12.08.2011 um 00:13 schrieb GOEKE, MATTHEW (AG/1000):

 My assumption would be that having a set of 4 raid 0 disks would actually be 
 better than having a controller that allowed pure JBOD of 4 disks due to the 
 cache on the controller. If anyone has any personal experience with this I 
 would love to know performance numbers but our infrastructure guy is doing 
 tests on exactly this over the next couple days so I will pass it along once 
 we have it.

 Matt

 -Original Message-
 From: Bharath Mundlapudi [mailto:bharathw...@yahoo.com]
 Sent: Thursday, August 11, 2011 5:00 PM
 To: common-user@hadoop.apache.org
 Subject: Re: Question about RAID controllers and hadoop

 True, you need a P410 controller. You can create RAID0 for each disk to make 
 it as JBOD.


 -Bharath



 
 From: Koert Kuipers ko...@tresata.com
 To: common-user@hadoop.apache.org
 Sent: Thursday, August 11, 2011 2:50 PM
 Subject: Question about RAID controllers and hadoop

 Hello all,
 We are considering using low end HP proliant machines (DL160s and DL180s)
 for cluster nodes. However with these machines if you want to do more than 4
 hard drives then HP puts in a P410 raid controller. We would configure the
 RAID controller to function as JBOD, by simply creating multiple RAID
 volumes with one disk. Does anyone have experience with this setup? Is it a
 good idea, or am i introducing a i/o bottleneck?
 Thanks for your help!
 Best, Koert



--
Kai Voigt
k...@123.org







Re: Question about RAID controllers and hadoop

2011-08-11 Thread Koert Kuipers
Hey Charles, I was considering using 8 drives, each set up as RAID0, so it's
good to hear such a setup is working for you.
Best, Koert

On Thu, Aug 11, 2011 at 6:26 PM, Charles Wimmer cwim...@yahoo-inc.comwrote:

 We currently use P410s in 12 disk system.  Each disk is set up as a RAID0
 volume.  Performance is at least as good as a bare disk.


 On 8/11/11 3:23 PM, GOEKE, MATTHEW (AG/1000) matthew.go...@monsanto.com
 wrote:

 If I read that email chain correctly then they were referring to the
 classic JBOD vs multiple disks striped together conversation. The
 conversation that was started here is referring to JBOD vs 1 RAID 0 per disk
 and the effects of the raid controller on those independent raids.

 Matt

 -Original Message-
 From: Kai Voigt [mailto:k...@123.org]
 Sent: Thursday, August 11, 2011 5:17 PM
 To: common-user@hadoop.apache.org
 Subject: Re: Question about RAID controllers and hadoop

 Yahoo did some testing 2 years ago:
 http://markmail.org/message/xmzc45zi25htr7ry

 But updated benchmark would be interesting to see.

 Kai

 Am 12.08.2011 um 00:13 schrieb GOEKE, MATTHEW (AG/1000):

  My assumption would be that having a set of 4 raid 0 disks would actually
 be better than having a controller that allowed pure JBOD of 4 disks due to
 the cache on the controller. If anyone has any personal experience with this
 I would love to know performance numbers but our infrastructure guy is doing
 tests on exactly this over the next couple days so I will pass it along once
 we have it.
 
  Matt
 
  -Original Message-
  From: Bharath Mundlapudi [mailto:bharathw...@yahoo.com]
  Sent: Thursday, August 11, 2011 5:00 PM
  To: common-user@hadoop.apache.org
  Subject: Re: Question about RAID controllers and hadoop
 
  True, you need a P410 controller. You can create RAID0 for each disk to
 make it as JBOD.
 
 
  -Bharath
 
 
 
  
  From: Koert Kuipers ko...@tresata.com
  To: common-user@hadoop.apache.org
  Sent: Thursday, August 11, 2011 2:50 PM
  Subject: Question about RAID controllers and hadoop
 
  Hello all,
  We are considering using low end HP proliant machines (DL160s and DL180s)
  for cluster nodes. However with these machines if you want to do more
 than 4
  hard drives then HP puts in a P410 raid controller. We would configure
 the
  RAID controller to function as JBOD, by simply creating multiple RAID
  volumes with one disk. Does anyone have experience with this setup? Is it
 a
  good idea, or am i introducing a i/o bottleneck?
  Thanks for your help!
  Best, Koert
 
 

 --
 Kai Voigt
 k...@123.org








Re: Question about RAID controllers and hadoop

2011-08-11 Thread Mohit Anchlia
On Thu, Aug 11, 2011 at 3:26 PM, Charles Wimmer cwim...@yahoo-inc.com wrote:
 We currently use P410s in 12 disk system.  Each disk is set up as a RAID0 
 volume.  Performance is at least as good as a bare disk.

Can you please share what throughput you see with P410s? Are these SATA or SAS?
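If you want to measure it yourself, two quick checks (a sketch; the jar name
matches 0.20.x and the mount point is an example):

# raw sequential write to one RAID0 volume, bypassing the page cache
dd if=/dev/zero of=/data1/ddtest bs=1M count=4096 oflag=direct

# HDFS-level throughput across the cluster
hadoop jar hadoop-0.20.2-test.jar TestDFSIO -write -nrFiles 10 -fileSize 1000
hadoop jar hadoop-0.20.2-test.jar TestDFSIO -read -nrFiles 10 -fileSize 1000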



 On 8/11/11 3:23 PM, GOEKE, MATTHEW (AG/1000) matthew.go...@monsanto.com 
 wrote:

 If I read that email chain correctly then they were referring to the classic 
 JBOD vs multiple disks striped together conversation. The conversation that 
 was started here is referring to JBOD vs 1 RAID 0 per disk and the effects of 
 the raid controller on those independent raids.

 Matt

 -Original Message-
 From: Kai Voigt [mailto:k...@123.org]
 Sent: Thursday, August 11, 2011 5:17 PM
 To: common-user@hadoop.apache.org
 Subject: Re: Question about RAID controllers and hadoop

 Yahoo did some testing 2 years ago: 
 http://markmail.org/message/xmzc45zi25htr7ry

 But updated benchmark would be interesting to see.

 Kai

 Am 12.08.2011 um 00:13 schrieb GOEKE, MATTHEW (AG/1000):

 My assumption would be that having a set of 4 raid 0 disks would actually be 
 better than having a controller that allowed pure JBOD of 4 disks due to the 
 cache on the controller. If anyone has any personal experience with this I 
 would love to know performance numbers but our infrastructure guy is doing 
 tests on exactly this over the next couple days so I will pass it along once 
 we have it.

 Matt

 -Original Message-
 From: Bharath Mundlapudi [mailto:bharathw...@yahoo.com]
 Sent: Thursday, August 11, 2011 5:00 PM
 To: common-user@hadoop.apache.org
 Subject: Re: Question about RAID controllers and hadoop

 True, you need a P410 controller. You can create RAID0 for each disk to make 
 it as JBOD.


 -Bharath



 
 From: Koert Kuipers ko...@tresata.com
 To: common-user@hadoop.apache.org
 Sent: Thursday, August 11, 2011 2:50 PM
 Subject: Question about RAID controllers and hadoop

 Hello all,
 We are considering using low end HP proliant machines (DL160s and DL180s)
 for cluster nodes. However with these machines if you want to do more than 4
 hard drives then HP puts in a P410 raid controller. We would configure the
 RAID controller to function as JBOD, by simply creating multiple RAID
 volumes with one disk. Does anyone have experience with this setup? Is it a
 good idea, or am i introducing a i/o bottleneck?
 Thanks for your help!
 Best, Koert



 --
 Kai Voigt
 k...@123.org