Re: Map task can't execute /bin/ls on solaris
Is HADOOP_HEAPSIZE set for all Hadoop-related Java processes, or just one Java process?

Regards,
Xiaobo Gu

On Thu, Aug 11, 2011 at 1:07 PM, Lance Norskog goks...@gmail.com wrote:

If the server is dedicated to this job, you might as well give it 10-15g. After that shakes out, try changing the number of mappers and reducers.

On Tue, Aug 9, 2011 at 2:06 AM, Xiaobo Gu guxiaobo1...@gmail.com wrote:

Hi Adi,

Thanks for your response. On an SMP server with 32G of RAM and 8 cores, what is your suggestion for setting HADOOP_HEAPSIZE? The server will be dedicated to a single-node Hadoop setup with one datanode instance, and it will run 4 mapper and reducer tasks.

Regards,
Xiaobo Gu

On Sun, Aug 7, 2011 at 11:35 PM, Adi adi.pan...@gmail.com wrote:

"Caused by: java.io.IOException: error=12, Not enough space" -- you either do not have enough memory allocated to your Hadoop daemons (via HADOOP_HEAPSIZE) or not enough swap space.

-Adi

On Sun, Aug 7, 2011 at 5:48 AM, Xiaobo Gu guxiaobo1...@gmail.com wrote:

Hi,

I am trying to write a map-reduce job to convert CSV files to SequenceFiles, but the job fails with the following error:

java.lang.RuntimeException: Error while running command to get file permissions : java.io.IOException: Cannot run program "/bin/ls": error=12, Not enough space
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:460)
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:200)
    at org.apache.hadoop.util.Shell.run(Shell.java:182)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:375)
    at org.apache.hadoop.util.Shell.execCommand(Shell.java:461)
    at org.apache.hadoop.util.Shell.execCommand(Shell.java:444)
    at org.apache.hadoop.fs.RawLocalFileSystem.execCommand(RawLocalFileSystem.java:540)
    at org.apache.hadoop.fs.RawLocalFileSystem.access$100(RawLocalFileSystem.java:37)
    at org.apache.hadoop.fs.RawLocalFileSystem$RawLocalFileStatus.loadPermissionInfo(RawLocalFileSystem.java:417)
    at org.apache.hadoop.fs.RawLocalFileSystem$RawLocalFileStatus.getOwner(RawLocalFileSystem.java:400)
    at org.apache.hadoop.mapred.TaskLog.obtainLogDirOwner(TaskLog.java:176)
    at org.apache.hadoop.mapred.TaskLogsTruncater.truncateLogs(TaskLogsTruncater.java:124)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:264)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
    at org.apache.hadoop.mapred.Child.main(Child.java:253)
Caused by: java.io.IOException: error=12, Not enough space
    at java.lang.UNIXProcess.forkAndExec(Native Method)
    at java.lang.UNIXProcess.<init>(UNIXProcess.java:53)
    at java.lang.ProcessImpl.start(ProcessImpl.java:65)
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:453)
    ... 16 more
    at org.apache.hadoop.fs.RawLocalFileSystem$RawLocalFileStatus.loadPermissionInfo(RawLocalFileSystem.java:442)
    at org.apache.hadoop.fs.RawLocalFileSystem$RawLocalFileStatus.getOwner(RawLocalFileSystem.java:400)
    at org.apache.hadoop.mapred.TaskLog.obtainLogDirOwner(TaskLog.java:176)
    at org.apache.hadoop.mapred.TaskLogsTruncater.truncateLogs(TaskLogsTruncater.java:124)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:264)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
    at org.apache.hadoop.mapred.Child.main(Child.java:253)

--
Lance Norskog
goks...@gmail.com
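On Solaris, error=12 is ENOMEM from fork(): the kernel could not reserve enough swap for the child process that runs /bin/ls, which matches Adi's diagnosis above. As a reference point, a minimal sketch of checking and extending swap on Solaris (the swap file path and size are illustrative assumptions, not recommendations):

    # Show current swap reservation and the configured swap devices.
    swap -s
    swap -l

    # Add a 4 GB swap file, then register it with the system.
    mkfile 4g /export/swapfile
    swap -a /export/swapfile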
Re: Map task can't execute /bin/ls on solaris
It applies to all Hadoop daemon processes (JT, TT, NN, SNN, DN) and to all direct commands executed via the 'hadoop' executable.

On Thu, Aug 11, 2011 at 11:37 AM, Xiaobo Gu guxiaobo1...@gmail.com wrote:

Is HADOOP_HEAPSIZE set for all Hadoop related Java processes, or just one Java process?

--
Harsh J
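Concretely, HADOOP_HEAPSIZE is set in conf/hadoop-env.sh, and the per-daemon *_OPTS variables in the same file can override it for individual processes, since they are appended after the default heap flag on the JVM command line. A hedged sketch (variable names as in the stock 0.20 hadoop-env.sh; the sizes are illustrative, not recommendations):

    # conf/hadoop-env.sh -- default heap, in MB, for every JVM the scripts launch.
    export HADOOP_HEAPSIZE=2000

    # Per-daemon overrides; a -Xmx here comes after the default
    # heap flag and therefore wins for that daemon only.
    export HADOOP_NAMENODE_OPTS="-Xmx4g $HADOOP_NAMENODE_OPTS"
    export HADOOP_DATANODE_OPTS="-Xmx1g $HADOOP_DATANODE_OPTS"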
My cluster datanode machine can't start
Hi,

I have deployed a Hadoop cluster across two machines: one acts as the namenode and the other as a datanode. The namenode machine's hostname is namenode1, and the datanode machine's hostname is datanode1. When I run ./start-all.sh on namenode1, the console displays:

root@namenode1:/opt/hadoop/bin# ./start-all.sh
starting namenode, logging to /opt/hadoop/bin/../logs/hadoop-root-namenode-namenode1.out
datanode1: starting datanode, logging to /opt/hadoop/bin/../logs/hadoop-root-datanode-datanode1.out
namenode1: starting secondarynamenode, logging to /opt/hadoop/bin/../logs/hadoop-root-secondarynamenode-namenode1.out
starting jobtracker, logging to /opt/hadoop/bin/../logs/hadoop-root-jobtracker-namenode1.out
datanode1: starting tasktracker, logging to /opt/hadoop/bin/../logs/hadoop-root-tasktracker-datanode1.out

Running jps on namenode1 shows these Java processes:

15438 JobTracker
15159 NameNode
15582 Jps
15362 SecondaryNameNode

After ssh-ing to datanode1, jps shows:

21417 TaskTracker
21497 Jps

So the datanode is not running, and I found this in the logs:

[root@datanode1 logs]# ls
hadoop-root-datanode-datanode1.out    hadoop-root-tasktracker-datanode1.log  hadoop-root-tasktracker-datanode1.out.2
hadoop-root-datanode-datanode1.out.1  hadoop-root-tasktracker-datanode1.out
hadoop-root-datanode-datanode1.out.2  hadoop-root-tasktracker-datanode1.out.1
[root@datanode1 logs]# cat hadoop-root-datanode-datanode1.out
Unrecognized option: -jvm
Could not create the Java virtual machine.

What should I do to solve this problem?

Thanks,
devilsp
Re: Where is web interface in stand alone operation?
Hi again:

I did format the namenode and it had a problem with a folder being locked. I tried again and it formatted, but it is still unable to work. I tried to copy input files and run the example jar. It gives:

my-user@ngs:~/hadoop-0.20.2_pseudo$ bin/hadoop fs -put input input
11/08/11 10:25:11 WARN hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/my-user/input/HadoopInputFile_Request_2011-08-05_162106_1.txt could only be replicated to 0 nodes, instead of 1
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1271)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
    at org.apache.hadoop.ipc.Client.call(Client.java:740)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
    at $Proxy0.addBlock(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
    at $Proxy0.addBlock(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2937)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2819)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2102)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2288)
11/08/11 10:25:11 WARN hdfs.DFSClient: Error Recovery for block null bad datanode[0] nodes == null
11/08/11 10:25:11 WARN hdfs.DFSClient: Could not get block locations. Source file /user/my-user/input/HadoopInputFile_Request_2011-08-05_162106_1.txt - Aborting...
put: java.io.IOException: File /user/my-user/input/HadoopInputFile_Request_2011-08-05_162106_1.txt could only be replicated to 0 nodes, instead of 1
11/08/11 10:25:11 ERROR hdfs.DFSClient: Exception closing file /user/my-user/input/HadoopInputFile_Request_2011-08-05_162106_1.txt : org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/my-user/input/HadoopInputFile_Request_2011-08-05_162106_1.txt could only be replicated to 0 nodes, instead of 1
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1271)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)
org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/my-user/input/HadoopInputFile_Request_2011-08-05_162106_1.txt could only be replicated to 0 nodes, instead of 1
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1271)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597) at
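The "could only be replicated to 0 nodes" error means the NameNode currently sees no live DataNodes. For anyone hitting this, a quick sketch of confirming that before digging through logs (run from the Hadoop install directory):

    # Ask the NameNode how many datanodes are live; if it reports
    # 0 live datanodes, the DN never registered and its log is the place to look.
    bin/hadoop dfsadmin -report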
Re: Where is web interface in stand alone operation?
Looks like your DataNode isn't properly up. Wipe your dfs.data.dir directory and restart your DN (this might be due to the formatting troubles you had earlier). Take a look at your DN's logs though, to confirm and understand what's going wrong.

On Thu, Aug 11, 2011 at 3:03 PM, A Df abbey_dragonfor...@yahoo.com wrote:

Hi again: I did format the namenode and it had a problem with a folder being locked. I tried again and it formatted but still unable to work. I tried to copy input files and run example jar. It gives: "put: java.io.IOException: File /user/my-user/input/HadoopInputFile_Request_2011-08-05_162106_1.txt could only be replicated to 0 nodes, instead of 1"
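A minimal sketch of that wipe-and-restart procedure, assuming dfs.data.dir was left at its default under hadoop.tmp.dir (the /tmp path in the datanode log later in this thread suggests that is the case); note this destroys any block data the node holds:

    bin/stop-all.sh
    # Default data directory when dfs.data.dir is unset; adjust to your config.
    rm -rf /tmp/hadoop-${USER}/dfs/data
    bin/start-all.sh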
Re: My cluster datanode machine can't start
A quick workaround is to not run your services as root. (Actually, you shouldn't ever run Hadoop as root!)

On Thu, Aug 11, 2011 at 3:02 PM, devilsp4 devil...@gmail.com wrote:

So the datanode is not running, and I found this in the logs:

Unrecognized option: -jvm
Could not create the Java virtual machine.

--
Harsh J
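A sketch of that workaround, assuming a dedicated unprivileged account named 'hadoop' (the account name and install path are illustrative); the start scripts only pass the '-jvm' flag to the datanode when launched as root, which is what trips this error:

    # Create an unprivileged account and give it the install tree.
    useradd hadoop
    chown -R hadoop:hadoop /opt/hadoop

    # Start the daemons as that user instead of root.
    su - hadoop -c '/opt/hadoop/bin/start-all.sh'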
Re: Installing Hadoop
Try this link; it might be useful for you:

http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/

jgroups wrote:

I am trying to install Hadoop in a cluster environment with multiple nodes, following the instructions from http://hadoop.apache.org/common/docs/r0.17.0/cluster_setup.html. That page refers to hadoop-site.xml, but I don't see that file in hadoop-0.20.203.0/conf. Are there more up-to-date installation instructions somewhere else?
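The missing file is expected: from 0.20 onwards the single hadoop-site.xml was split into core-site.xml, hdfs-site.xml and mapred-site.xml under conf/. A hedged sketch of what the first of these might contain for a small cluster (the hostname and port are illustrative assumptions):

    conf/core-site.xml:

    <?xml version="1.0"?>
    <configuration>
      <property>
        <name>fs.default.name</name>
        <value>hdfs://master:9000</value>
      </property>
    </configuration>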
Re: hadoop startup problem
Hi, I had this issue too. I later found out that a folder has to be cleared every time after executing, though I'm not sure which folder in Hadoop it is.

asmaa.atef wrote:

Hello everyone, I have a problem with Hadoop startup. Every time I try to start Hadoop, the namenode does not start, and when I try to stop the namenode it gives an error: no namenode to stop. I tried to format the namenode and it works well, but now I have data in Hadoop and formatting the namenode would erase all the data. What can I do? Thanks in advance, asmaa
Re: Where is web interface in stand alone operation?
Hello:

I cannot add replies inline, so here I go again. I checked the datanode logs and it had a problem with the namespaceID for the namenode and datanode. I am not sure why, since I did not change those variables. A sample log for those interested is below, and my message continues after it.

2011-08-11 10:23:58,630 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG:   host = ngs.wmin.ac.uk/161.74.12.97
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 0.20.2
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
************************************************************/
2011-08-11 10:23:59,208 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Incompatible namespaceIDs in /tmp/hadoop-w1153435/dfs/data: namenode namespaceID = 915370409; datanode namespaceID = 1914136941
    at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:233)
    at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:148)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:298)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:216)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1283)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1238)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1246)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1368)
2011-08-11 10:23:59,209 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at ngs.wmin.ac.uk/161.74.12.97
************************************************************/

I stopped Hadoop, deleted the data from the dfs.data.dir, and restarted everything. I also had to delete the input and output directories and set those up again, and then it ran properly. I also tried the web interfaces and they both worked. Thanks.

I will have a look at the logs; however, since standalone mode does not use logs, another user suggested using the time command and strace. How would I use strace, given that the job runs to completion and there is no time in between to run that command? I wanted to get job details such as those produced in the logs for pseudo-distributed operation, but for the Java process instead. Is there a way to check the Java process for the standalone job while it's running, or afterwards?

Cheers,
A Df

From: Harsh J ha...@cloudera.com
To: A Df abbey_dragonfor...@yahoo.com
Cc: common-user@hadoop.apache.org
Sent: Thursday, 11 August 2011, 10:37
Subject: Re: Where is web interface in stand alone operation?

Looks like your DataNode isn't properly up. Wipe your dfs.data.dir directory and restart your DN (might be due to the formatting troubles you had earlier). Take a look at your DN's logs though, to confirm and understand what's going wrong.
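For reference, the commonly cited alternative to wiping the data directory is to align the datanode's stored namespaceID with the reformatted namenode's, which keeps the block files in place. A hedged sketch against the paths and IDs in the log above (GNU sed assumed; stop the datanode first and edit with care):

    # dfs/data/current/VERSION holds the datanode's namespaceID;
    # set it to the value the namenode reports (915370409 in the log above).
    sed -i 's/^namespaceID=.*/namespaceID=915370409/' \
        /tmp/hadoop-w1153435/dfs/data/current/VERSION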
Re: Map task can't execute /bin/ls on solaris
Some other options that affect the number of mappers and reducers and the amount of memory they use:

mapred.child.java.opts (e.g. -Xmx1200M; the heap for your mapper/reducer, or any other Java options) - this will decide the number of (512M) slots per mapper.

The split size affects the number of splits (and, in effect, the number of mappers), depending on your input file and input format (if you are using FileInputFormat or something derived from it):

mapreduce.input.fileinputformat.split.maxsize - maximum number of bytes per split
mapreduce.input.fileinputformat.split.minsize - minimum number of bytes per split

-Adi

On Thu, Aug 11, 2011 at 2:11 AM, Harsh J ha...@cloudera.com wrote:

It applies to all Hadoop daemon processes (JT, TT, NN, SNN, DN) and all direct commands executed via the 'hadoop' executable.
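These can also be set per job rather than cluster-wide; a sketch using generic options (the jar and class names are hypothetical, the job must use ToolRunner/GenericOptionsParser for -D to be picked up, and on 0.20.x the split bounds go by the older names mapred.min.split.size and mapred.max.split.size):

    # 1200M child heap and 128MB-256MB split bounds, for this job only.
    hadoop jar myjob.jar MyJob \
      -Dmapred.child.java.opts=-Xmx1200m \
      -Dmapreduce.input.fileinputformat.split.minsize=134217728 \
      -Dmapreduce.input.fileinputformat.split.maxsize=268435456 \
      input output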
Re: java.lang.NoClassDefFoundError: com.sun.security.auth.UnixPrincipal
Hi, I failed to install the Oracle JDK on IBM AIX, with the following errors:

$ chmod +x jdk-6u24-linux-i586.bin
$ ./jdk-6u24-linux-i586.bin
Unpacking...
Checksumming...
Extracting...
./jdk-6u24-linux-i586.bin: ./install.sfx.409892: 0403-006 Execute permission denied.
Failed to extract the files. Please refer to the Troubleshooting section of the Installation Instructions on the download page for more information.
Hadoop in PBS + Lustre
Hi, I found some materials about submitting Hadoop jobs via PBS. Any idea how to interactively browse HDFS through PBS? Our supercomputer uses the Lustre storage system. I found a wiki page that discusses using the PolyServe storage system rather than HDFS. Has anyone tried Lustre + Hadoop + PBS? Is any special care needed to make that work? Thanks! Shi
Hadoop users in Pittsburgh PA
Hey, If you are using Hadoop in Pittsburgh, please send me a quick note with a short description of what you are doing. I'm especially interested in companies using Hadoop. I run the local Pittsburgh Hadoop Users Group Meetup. Thanks, Doug
Avatar namenode?
Hi All, I am running HBase in distributed mode on a seven-node cluster with a backup master, and HBase runs properly in that backup-master environment. I now want to run this HBase on top of a highly available Hadoop. I read about the AvatarNode at http://hadoopblog.blogspot.com/2010/02/hadoop-namenode-high-availability.html and I need more help with the AvatarNode configuration.

1. Which IP should be given to the datanodes' fs.default.name?
2. Is there any good method other than the AvatarNode for a backup namenode?

Regards, Shanmuganathan
Question about RAID controllers and hadoop
Hello all, We are considering using low-end HP ProLiant machines (DL160s and DL180s) for cluster nodes. However, with these machines, if you want more than 4 hard drives then HP puts in a P410 RAID controller. We would configure the RAID controller to function as JBOD by simply creating multiple RAID volumes with one disk each. Does anyone have experience with this setup? Is it a good idea, or am I introducing an I/O bottleneck? Thanks for your help! Best, Koert
Re: Question about RAID controllers and hadoop
True, you need a P410 controller. You can create a RAID0 volume for each disk to make it act as JBOD.

-Bharath

From: Koert Kuipers ko...@tresata.com
To: common-user@hadoop.apache.org
Sent: Thursday, August 11, 2011 2:50 PM
Subject: Question about RAID controllers and hadoop

We would configure the RAID controller to function as JBOD by simply creating multiple RAID volumes with one disk each. Does anyone have experience with this setup?
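For what it's worth, this per-disk RAID0 layout is scriptable through HP's hpacucli utility; a hedged sketch (the controller slot and physical drive addresses are illustrative and vary by chassis):

    # One single-drive RAID0 logical drive per physical disk.
    hpacucli ctrl slot=0 create type=ld drives=1I:1:1 raid=0
    hpacucli ctrl slot=0 create type=ld drives=1I:1:2 raid=0
    # Repeat for the remaining drives, then verify the result:
    hpacucli ctrl slot=0 ld all show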
RE: Question about RAID controllers and hadoop
My assumption would be that having a set of 4 single-disk RAID0 volumes would actually be better than having a controller that allowed pure JBOD of 4 disks, due to the cache on the controller. If anyone has personal experience with this I would love to see performance numbers, but our infrastructure guy is running tests on exactly this over the next couple of days, so I will pass the results along once we have them.

Matt
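While waiting on proper numbers, a crude sequential-throughput comparison can be run per configuration with dd; a sketch assuming one filesystem mounted per data disk (paths and sizes are illustrative, and the test file should comfortably exceed the controller cache so you measure the disk rather than the cache):

    # Sequential write, then read, of an 8 GB file on one data disk;
    # conv=fsync forces dd to flush before reporting a rate.
    dd if=/dev/zero of=/data1/ddtest bs=1M count=8192 conv=fsync
    dd if=/data1/ddtest of=/dev/null bs=1M
    rm /data1/ddtest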
Re: Question about RAID controllers and hadoop
Yahoo did some testing 2 years ago: http://markmail.org/message/xmzc45zi25htr7ry

An updated benchmark would be interesting to see, though.

Kai

--
Kai Voigt
k...@123.org
Re: Question about RAID controllers and hadoop
We currently use P410s in 12-disk systems. Each disk is set up as a RAID0 volume. Performance is at least as good as a bare disk.

On 8/11/11 3:23 PM, GOEKE, MATTHEW (AG/1000) matthew.go...@monsanto.com wrote:

If I read that email chain correctly, they were referring to the classic JBOD vs. multiple-disks-striped-together comparison. The conversation started here is about JBOD vs. one RAID0 per disk, and the effect of the RAID controller on those independent RAID volumes.

Matt
Re: Question about RAID controllers and hadoop
Hey Charles, I was considering using 8 drives, each set up as a RAID0 volume, so it's good to hear such a setup is working for you.

Best,
Koert

On Thu, Aug 11, 2011 at 6:26 PM, Charles Wimmer cwim...@yahoo-inc.com wrote:

We currently use P410s in 12-disk systems. Each disk is set up as a RAID0 volume. Performance is at least as good as a bare disk.
Re: Question about RAID controllers and hadoop
On Thu, Aug 11, 2011 at 3:26 PM, Charles Wimmer cwim...@yahoo-inc.com wrote:

We currently use P410s in 12-disk systems. Each disk is set up as a RAID0 volume. Performance is at least as good as a bare disk.

Can you please share what throughput you see with the P410s? Are these SATA or SAS disks?