Re: HDFS Explained as Comics
Hi all, very cool comic! Thanks, Alex

On Wed, Nov 30, 2011 at 11:58 PM, Abhishek Pratap Singh manu.i...@gmail.com wrote:
Hi, this is indeed a good way to explain; most of the improvements have already been discussed. Waiting for the sequel of this comic. Regards, Abhishek

On Wed, Nov 30, 2011 at 1:55 PM, maneesh varshney mvarsh...@gmail.com wrote:
Hi Matthew, I agree with both you and Prashant. The strip needs to be modified to explain that these can be default values that can be optionally overridden (which I will fix in the next iteration). However, from the 'understanding concepts of HDFS' point of view, I still think that block size and replication factor are the real strengths of HDFS, and learners must be exposed to them so that they get to see how HDFS is significantly different from conventional file systems. On a personal note: thanks for the first part of your message :) -Maneesh

On Wed, Nov 30, 2011 at 1:36 PM, GOEKE, MATTHEW (AG/1000) matthew.go...@monsanto.com wrote:
Maneesh, firstly, I love the comic :) Secondly, I am inclined to agree with Prashant on this latest point. While one code path could take us through the user defining command line overrides (e.g. hadoop fs -D blah -put foo bar), I think it might confuse a person new to Hadoop. The most common flow would be using admin-determined values from hdfs-site, and the only thing that would need to change is that the conversation happens between client / server and not user / client. Matt

-Original Message- From: Prashant Kommireddi [mailto:prash1...@gmail.com] Sent: Wednesday, November 30, 2011 3:28 PM To: common-user@hadoop.apache.org Subject: Re: HDFS Explained as Comics
Sure, it's just a case of how readers interpret it:
1. Client is required to specify block size and replication factor each time, or
2. Client does not need to worry about it since an admin has set the properties in the default configuration files.
A client could not be allowed to override the default configs if they are set final (well, there are ways to go around that as well, as you suggest, by using create() :). The information is great and helpful. Just want to make sure a beginner who wants to write a WordCount in MapReduce does not worry about specifying block size and replication factor in his code. Thanks, Prashant

On Wed, Nov 30, 2011 at 1:18 PM, maneesh varshney mvarsh...@gmail.com wrote:
Hi Prashant, others may correct me if I am wrong here. The client (org.apache.hadoop.hdfs.DFSClient) has knowledge of block size and replication factor. In the source code, I see the following in the DFSClient constructor:

defaultBlockSize = conf.getLong("dfs.block.size", DEFAULT_BLOCK_SIZE);
defaultReplication = (short) conf.getInt("dfs.replication", 3);

My understanding is that the client considers the following chain for the values:
1. Manual values (the long-form constructor, when a user provides these values)
2. Configuration file values (these are cluster-level defaults: dfs.block.size and dfs.replication)
3. Finally, the hardcoded values (DEFAULT_BLOCK_SIZE and 3)

Moreover, in org.apache.hadoop.hdfs.protocol.ClientProtocol the API to create a file is

void create(..., short replication, long blocksize);

I presume this means that the client already has knowledge of these values and passes them to the NameNode when creating a new file. Hope that helps. Thanks, -Maneesh

On Wed, Nov 30, 2011 at 1:04 PM, Prashant Kommireddi prash1...@gmail.com wrote: Thanks Maneesh.
Quick question: does a client really need to know block size and replication factor? A lot of the time the client has no control over these (they are set at the cluster level). -Prashant Kommireddi

On Wed, Nov 30, 2011 at 12:51 PM, Dejan Menges dejan.men...@gmail.com wrote:
Hi Maneesh, thanks a lot for this! Just distributed it over the team and the comments are great :) Best regards, Dejan

On Wed, Nov 30, 2011 at 9:28 PM, maneesh varshney mvarsh...@gmail.com wrote:
For your reading pleasure! PDF, 3.3MB, uploaded at (the mailing list has a cap of 1MB on attachments): https://docs.google.com/open?id=0B-zw6KHOtbT4MmRkZWJjYzEtYjI3Ni00NTFjLWE0OGItYTU5OGMxYjc0N2M1
Appreciate it if you can spare some time to peruse this little experiment of mine to use comics as a medium to explain computer science topics. This particular issue explains the protocols and internals of HDFS. I am eager to hear your opinions on the usefulness of this visual
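[A minimal sketch to make the chain discussed above concrete (my addition, not from the thread): per-call values win over the dfs.block.size / dfs.replication config defaults, because the public FileSystem.create() overload lets the caller pass both explicitly. The namenode URI and paths below are made-up placeholders.]

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BlockSizeDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration(); // picks up hdfs-site.xml cluster defaults
        FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:8020"), conf);
        // create(Path, overwrite, bufferSize, replication, blockSize):
        // the per-call values override dfs.replication and dfs.block.size for this file only.
        FSDataOutputStream out = fs.create(new Path("/tmp/demo.txt"),
                true,                 // overwrite
                4096,                 // io buffer size
                (short) 2,            // replication factor for this file
                64L * 1024 * 1024);   // 64 MB block size for this file
        out.writeBytes("hello hdfs\n");
        out.close();
    }
}

[A beginner who omits these arguments, as Prashant wants, simply gets the admin-set defaults via the shorter create(Path) overload.]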
Re: Issue with DistributedCache
Hi, a typo? import com.bejoy.sampels.worcount.WordCountDriver; = wor_d_count ? - alex

On Thu, Nov 24, 2011 at 3:45 PM, Bejoy Ks bejoy.had...@gmail.com wrote:
Hi Denis, I tried your code without the distributed cache locally and it worked fine for me. Please find it at http://pastebin.com/ki175YUx
I echo Mike's words on submitting map reduce jobs remotely. The remote machine can be your local PC or any utility server, as Mike specified. What you need to have on the remote machine is a replica of the hadoop jars and configuration files, the same as on your hadoop cluster. (If you don't have a remote util server set up, you can use your dev machine for the same.) Just trigger the hadoop job on the local machine and the actual job will be submitted and run on your cluster, based on the NN host and configuration parameters you have in your config files. Hope it helps! Regards, Bejoy.K.S

On Thu, Nov 24, 2011 at 7:09 PM, Michel Segel michael_se...@hotmail.com wrote:
Denis... Sorry, you lost me. Just to make sure we're using the same terminology... The cluster is comprised of two types of nodes: the data nodes, which run DN, TT, and if you have HBase, RS. Then there are control nodes, which run your NN, SN, JT and, if you run HBase, HM and ZKs... Outside of the cluster we have machines set up with Hadoop installed but not running any of the processes. They are where our users launch their jobs. We call them edge nodes. (It's not a good idea to let users directly on the actual cluster.) Ok, having said all of that... You launch your job from the edge nodes... Your data sits in HDFS, so you don't need the distributed cache at all. Does that make sense? Your job will run on the local machine, connect to the JT and then run. We set up the edge nodes so that all of the jars and config files are already set up for the users and we can better control access... Sent from a remote device. Please excuse any typos... Mike Segel

On Nov 24, 2011, at 7:22 AM, Denis Kreis de.kr...@gmail.com wrote:
Without using the distributed cache I'm getting the same error. It's because I start the job from a remote client / programmatically.

2011/11/24 Michel Segel michael_se...@hotmail.com:
Silly question... Why do you need to use the distributed cache for the word count program? What are you trying to accomplish? I've only had to play with it for one project, where we had to push out a bunch of C++ code to the nodes as part of a job... Sent from a remote device. Please excuse any typos... Mike Segel

On Nov 24, 2011, at 7:05 AM, Denis Kreis de.kr...@gmail.com wrote:
Hi Bejoy 1.
Old API: The Map and Reduce classes are the same as in the example; the main method is as follows (the message is truncated here as in the archive; quotes and generics were stripped by the list formatting and are restored):

public static void main(String[] args) throws IOException, InterruptedException {
    UserGroupInformation ugi = UserGroupInformation.createProxyUser("remote user name",
            UserGroupInformation.getLoginUser());
    ugi.doAs(new PrivilegedExceptionAction<Void>() {
        public Void run() throws Exception {
            JobConf conf = new JobConf(WordCount.class);
            conf.setJobName("wordcount");
            conf.setOutputKeyClass(Text.class);
            conf.setOutputValueClass(IntWritable.class);
            conf.setMapperClass(Map.class);
            conf.setCombinerClass(Reduce.class);
            conf.setReducerClass(Reduce.class);
            conf.setInputFormat(TextInputFormat.class);
            conf.setOutputFormat(TextOutputFormat.class);
            FileInputFormat.setInputPaths(conf, new Path("path to input dir"));
            FileOutputFormat.setOutputPath(conf, new Path("path to output dir"));
            conf.set("mapred.job.tracker", "ip:8021");
            FileSystem fs = FileSystem.get(new URI("hdfs://ip:8020"), new Configuration());
            fs.mkdirs(new Path("remote path"));
            fs.copyFromLocalFile(new Path("local path/test.jar"), new Path("remote path"));

-- Alexander Lorenz http://mapredit.blogspot.com
new LAB VM online
Hi, I created a new testing environment as a VirtualBox image. It contains 4 servers, CDH3u2, HBase, Hive, Stargate, Sqoop. I use them for testing; I don't know if anyone else will use them too. The image is around 4 GB and will deploy 4 servers with 40GB HDD each. I wrote a post about it on my blog. I think for new users the setup could be very helpful (I use them on my MacBook when I travel, and it works). best, Alex

-- Alexander Lorenz http://mapredit.blogspot.com
Re: how to start tasktracker only on single port
Hi, please explain the reason to kill (I assume kill -9) a tasktracker. The best way is to use the start / stop scripts. best, Alex

On Mon, Nov 14, 2011 at 8:39 AM, mohmmadanis moulavi anis_moul...@yahoo.co.in wrote:
Hello friends, I am using Hadoop version 0.20.2. My problem is that whenever I kill the tasktracker and start it again, the jobtracker shows one extra tasktracker (the one which was killed plus the one which started afterwards). I want it to work like this: whenever I kill the tasktracker it stops sending heartbeats, but when I start the tasktracker again, it should resume sending heartbeats, i.e. it should start that tasktracker on the same port as before. What changes should I make in the configuration parameters for that? Please let me know. Thanks, Regards, Mohmmadanis Moulavi

-- Alexander Lorenz http://mapredit.blogspot.com
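[One possible direction, as a hedged sketch rather than a confirmed fix: the tracker name the jobtracker displays includes the task report port, which is ephemeral by default (mapred.task.tracker.report.address defaults to 127.0.0.1:0, i.e. a new port on every start). Pinning it in mapred-site.xml should make a restarted tasktracker re-register under the same name; the port number below is illustrative and untested:]

<property>
  <!-- default is 127.0.0.1:0; port 0 means a fresh ephemeral port per restart -->
  <name>mapred.task.tracker.report.address</name>
  <value>127.0.0.1:50050</value>
</property>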
Re: OceanSync
Hi, http://www.cloudera.com/products-services/scm-express/ works. I didn't know OceanSync, sorry. - Alex

On Tue, Nov 8, 2011 at 8:14 AM, DesignerSmoke designersm...@yahoo.com wrote:
Does anyone have BETA access to OceanSync's Hadoop Management Software? The website is http://www.oceansync.com and the sourceforge page is http://sourceforge.net/p/oceansync/ Is there other software like this somewhere?

-- Alexander Lorenz http://mapredit.blogspot.com
Re: correct way to reserve space
Hi, in hdfs-site.xml:

<property>
  <name>dfs.datanode.du.reserved</name>
  <value>VALUE_HERE (maybe 300)</value>
</property>
<property>
  <name>dfs.datanode.du.pct</name>
  <value>0.9f</value>
</property>

But it is not available for each partition; it's a general parameter. Correct me please if I'm wrong. regards, Alex

On Thu, Oct 27, 2011 at 2:24 AM, Rita rmorgan...@gmail.com wrote:
What is the correct way to reserve space for hdfs? I currently have 2 filesystems, /fs1 and /fs2, and I would like to reserve space for non-dfs operations. For example, for /fs1 I would like to reserve 30gb of space for non-dfs, and 10gb of space for /fs2. I fear HADOOP-2991 is still haunting us? I am using CDH 3U1
-- --- Get your facts first, then you can distort them as you please.--

-- Alexander Lorenz http://mapredit.blogspot.com
Re: correct way to reserve space
Hi Harsh, ah, nice to know, thanks ;) best, Alex

On Thu, Oct 27, 2011 at 9:27 AM, Harsh J ha...@cloudera.com wrote:
The percentage opt is not valid on 0.20 iirc.
-- Harsh J

-- Alexander Lorenz http://mapredit.blogspot.com
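[A hedged worked example, added for concreteness and not from the thread: dfs.datanode.du.reserved takes a per-volume byte count, so the 30 GB Rita asks about would be 30 * 1024^3 = 32212254720 bytes:]

<property>
  <name>dfs.datanode.du.reserved</name>
  <!-- reserve 30 GB (in bytes) for non-DFS use on each dfs.data.dir volume -->
  <value>32212254720</value>
</property>

[As noted above, this applies to every data directory alike; there is no per-partition variant.]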
Re: running sqoop on hadoop cluster
Hi, first set up a valid cluster: namenode, secondary namenode, jobtracker + datanodes with tasktrackers. After that, install sqoop on a datanode and play with it ;) Here is a howto for RedHat (CentOS): http://mapredit.blogspot.com/p/get-hadoop-cluster-running-in-20.html and for Ubuntu: http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/ regards, Alex

On Fri, Oct 21, 2011 at 2:03 AM, firantika firantika.agust...@gmail.com wrote:
Hi All, I'm a newbie on hadoop. If I install hadoop on 2 nodes, where does hdfs run? On the master or the slave node? And if I run sqoop to export a dbms to hive, does it make a speed difference between hadoop running on a single node and hadoop multi-node? Please explain? Tks

-- Alexander Lorenz http://mapredit.blogspot.com
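[For the sqoop-to-hive part of the question, a hedged sketch of what such an import typically looks like (Sqoop 1 syntax; the connection string, table and credentials are placeholders):]

sqoop import \
  --connect jdbc:mysql://dbhost/salesdb \
  --table customers \
  --username dbuser -P \
  --hive-import

[The import runs as a MapReduce job on the cluster, so with multiple datanodes the transfer is split across parallel map tasks (see --num-mappers), which is where a multi-node speedup would come from.]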
Re: jobtracker cannot be started
Hi, what heap size have you given the jobtracker? And how many jobs / users / tasks run? What do the logs say? Turn on GC logging: http://java.sun.com/developer/technicalArticles/Programming/GCPortal/ - Alex

On Fri, Oct 21, 2011 at 9:47 AM, Peng, Wei wei.p...@xerox.com wrote:
Hi, when I was running a job on hadoop with 75% of the mappers finished, the jobtracker hung, so that I could not access jobtrackerserver:7845/jobtracker.jsp, and hadoop job -status hung as well. Then I stopped the jobtracker and restarted it. However, the jobtracker cannot be started. I received an error message from jobtracker.log.out saying

Exception in thread LeaseChecker java.lang.OutOfMemoryError: Java heap space
  at java.io.BufferedOutputStream.init(BufferedOutputStream.java:59)
  at java.io.BufferedOutputStream.init(BufferedOutputStream.java:42)
  at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:318)
  at org.apache.hadoop.ipc.Client$Connection.access$1700(Client.java:176)
  at org.apache.hadoop.ipc.Client.getConnection(Client.java:859)
  at org.apache.hadoop.ipc.Client.call(Client.java:719)
  at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
  at $Proxy4.renewLease(Unknown Source)
  at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
  at java.lang.reflect.Method.invoke(Method.java:597)
  at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
  at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
  at $Proxy4.renewLease(Unknown Source)
  at org.apache.hadoop.hdfs.DFSClient$LeaseChecker.renew(DFSClient.java:1016)
  at org.apache.hadoop.hdfs.DFSClient$LeaseChecker.run(DFSClient.java:1028)
  at java.lang.Thread.run(Thread.java:619)

Exception in thread expireTrackers java.lang.OutOfMemoryError: Java heap space
  at java.util.Arrays.copyOf(Arrays.java:2882)
  at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)
  at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:390)
  at java.lang.StringBuffer.append(StringBuffer.java:224)
  at org.apache.hadoop.mapred.JobHistory.log(JobHistory.java:354)
  at org.apache.hadoop.mapred.JobHistory$MapAttempt.logStarted(JobHistory.java:1354)
  at org.apache.hadoop.mapred.JobInProgress.failedTask(JobInProgress.java:2332)
  at org.apache.hadoop.mapred.JobInProgress.updateTaskStatus(JobInProgress.java:849)
  at org.apache.hadoop.mapred.JobInProgress.failedTask(JobInProgress.java:2463)
  at org.apache.hadoop.mapred.JobTracker.lostTaskTracker(JobTracker.java:3474)
  at org.apache.hadoop.mapred.JobTracker$ExpireTrackers.run(JobTracker.java:348)
  at java.lang.Thread.run(Thread.java:619)

Exception in thread IPC Server listener on 9001 java.lang.OutOfMemoryError: Java heap space
java.lang.reflect.InvocationTargetException
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
  at java.lang.reflect.Method.invoke(Method.java:597)
  at org.mortbay.log.Slf4jLog.warn(Slf4jLog.java:126)
  at org.mortbay.log.Log.warn(Log.java:181)
  at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:449)
  at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
  at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
  at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
  at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:417)
  at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
  at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
  at org.mortbay.jetty.Server.handle(Server.java:324)
  at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:534)
  at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:864)
  at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:533)
  at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:207)
  at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:403)
  at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:409)
  at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:522)
Caused by: java.lang.OutOfMemoryError: Java heap space
Re: jobtracker cannot be started
Add the opts to the JDK call in hadoop-env.sh. The logs should be accessible in the hadoop log directory. Also check http://jobtracker:50030/stacks - that's the same as jstack (jstack PID). Also you can use jstack -F PID to get a core file (similar to /stacks, I think) @jobtracker. Are you using a 64bit JDK? Which version? regards, Alex

On Fri, Oct 21, 2011 at 10:00 AM, Peng, Wei wei.p...@xerox.com wrote:
I am using the default heap size, which is 1000MB. The jobtracker hung when I was running only one job. Now I could not even restart the jobtracker. Can you teach me how to turn on GC logging in hadoop? Thanks! Wei

-Original Message- From: Alexander C.H. Lorenz [mailto:wget.n...@googlemail.com] Sent: Friday, October 21, 2011 3:54 AM To: common-user@hadoop.apache.org Subject: Re: jobtracker cannot be started
Hi, what heap size have you given the jobtracker? And how many jobs / users / tasks run? What do the logs say? Turn on GC logging: http://java.sun.com/developer/technicalArticles/Programming/GCPortal/ - Alex
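[To make "add the opts to the JDK call" concrete, a hedged sketch of enabling GC logging for the jobtracker in hadoop-env.sh; the flag set and log path are illustrative, not from the thread:]

# hadoop-env.sh on the jobtracker node; restart the jobtracker afterwards
export HADOOP_JOBTRACKER_OPTS="$HADOOP_JOBTRACKER_OPTS -verbose:gc \
  -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:/var/log/hadoop/jobtracker-gc.log"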
Re: jobtracker cannot be started
) org.mortbay.io.nio.SelectorManager$SelectSet.doSelect(SelectorManager.java:429)
org.mortbay.io.nio.SelectorManager.doSelect(SelectorManager.java:185)
org.mortbay.jetty.nio.SelectChannelConnector.accept(SelectChannelConnector.java:124)
org.mortbay.jetty.AbstractConnector$Acceptor.run(AbstractConnector.java:707)
org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:522)

Thread 22 (org.apache.hadoop.hdfs.server.namenode.DecommissionManager$Monitor@67c7980c):
  State: TIMED_WAITING  Blocked count: 5  Waited count: 127
  Stack:
    java.lang.Thread.sleep(Native Method)
    org.apache.hadoop.hdfs.server.namenode.DecommissionManager$Monitor.run(DecommissionManager.java:65)
    java.lang.Thread.run(Thread.java:619)

Thread 21 (org.apache.hadoop.hdfs.server.namenode.FSNamesystem$ReplicationMonitor@2094257f):
  State: TIMED_WAITING  Blocked count: 20  Waited count: 1263
  Stack:
    java.lang.Thread.sleep(Native Method)
    org.apache.hadoop.hdfs.server.namenode.FSNamesystem$ReplicationMonitor.run(FSNamesystem.java:2304)
    java.lang.Thread.run(Thread.java:619)

Thread 20 (org.apache.hadoop.hdfs.server.namenode.LeaseManager$Monitor@3a51127a):
  State: TIMED_WAITING  Blocked count: 21  Waited count: 1875
  Stack:
    java.lang.Thread.sleep(Native Method)
    org.apache.hadoop.hdfs.server.namenode.LeaseManager$Monitor.run(LeaseManager.java:349)
    java.lang.Thread.run(Thread.java:619)

Thread 19 (org.apache.hadoop.hdfs.server.namenode.FSNamesystem$HeartbeatMonitor@61578aab):
  State: TIMED_WAITING  Blocked count: 0  Waited count: 13
  Stack:
    java.lang.Thread.sleep(Native Method)
    org.apache.hadoop.hdfs.server.namenode.FSNamesystem$HeartbeatMonitor.run(FSNamesystem.java:2286)
    java.lang.Thread.run(Thread.java:619)

Thread 18 (org.apache.hadoop.hdfs.server.namenode.PendingReplicationBlocks$PendingReplicationMonitor@2339e351):
  State: TIMED_WAITING  Blocked count: 0  Waited count: 13
  Stack:
    java.lang.Thread.sleep(Native Method)
    org.apache.hadoop.hdfs.server.namenode.PendingReplicationBlocks$PendingReplicationMonitor.run(PendingReplicationBlocks.java:186)
    java.lang.Thread.run(Thread.java:619)

Thread 9 (RMI TCP Accept-0):
  State: RUNNABLE  Blocked count: 0  Waited count: 0
  Stack:
    java.net.PlainSocketImpl.socketAccept(Native Method)
    java.net.PlainSocketImpl.accept(PlainSocketImpl.java:384)
    java.net.ServerSocket.implAccept(ServerSocket.java:453)
    java.net.ServerSocket.accept(ServerSocket.java:421)
    sun.rmi.transport.tcp.TCPTransport$AcceptLoop.executeAcceptLoop(TCPTransport.java:369)
    sun.rmi.transport.tcp.TCPTransport$AcceptLoop.run(TCPTransport.java:341)
    java.lang.Thread.run(Thread.java:619)

Thread 4 (Signal Dispatcher):
  State: RUNNABLE  Blocked count: 0  Waited count: 0
  Stack:

Thread 3 (Finalizer):
  State: WAITING  Blocked count: 0  Waited count: 40
  Waiting on java.lang.ref.ReferenceQueue$Lock@22f62eba
  Stack:
    java.lang.Object.wait(Native Method)
    java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:116)
    java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:132)
    java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)

Thread 2 (Reference Handler):
  State: WAITING  Blocked count: 1  Waited count: 39
  Waiting on java.lang.ref.Reference$Lock@646d6aa0
  Stack:
    java.lang.Object.wait(Native Method)
    java.lang.Object.wait(Object.java:485)
    java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)

Thread 1 (main):
  State: WAITING  Blocked count: 10  Waited count: 10
  Waiting on org.apache.hadoop.ipc.RPC$Server@41f6321
  Stack:
    java.lang.Object.wait(Native Method)
    java.lang.Object.wait(Object.java:485)
    org.apache.hadoop.ipc.Server.join(Server.java:1122)
    org.apache.hadoop.hdfs.server.namenode.NameNode.join(NameNode.java:292)
    org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:966)

Wei
Re: jobtracker cannot be started
should, yes ;) I use 2000 in our environment, but it depends on the memory of your servers. regards, Alex

On Fri, Oct 21, 2011 at 10:58 AM, Peng, Wei wei.p...@xerox.com wrote:
Yes, the heap size is the default 1000m: /bin/java -Xmx1000m So if I change the heapsize to be bigger, I should be able to solve this problem? Thanks, Wei

-Original Message- From: Alexander C.H. Lorenz [mailto:wget.n...@googlemail.com] Sent: Friday, October 21, 2011 4:53 AM To: common-user@hadoop.apache.org Subject: Re: jobtracker cannot be started
Looks like the heap utilization has exceeded the value set by -Xmx. Do a ps waux|grep java @jobtracker |grep -i xmx The heapsize is set in hadoop-env.sh: export HADOOP_HEAPSIZE= default 1000, I think. - alex

On Fri, Oct 21, 2011 at 10:31 AM, Peng, Wei wei.p...@xerox.com wrote:
Thank you for your quick reply!! I cannot change the hadoop conf files because they are owned by a person who has left the company, though I have root access. My Java version is java version 1.5.0_07 Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_07-b03) Java HotSpot(TM) 64-Bit Server VM (build 1.5.0_07-b03, mixed mode)

The log on http://jobtracker:50030/stacks is

Process Thread Dump: 26 active threads
Thread 53 (1424598978@qtp0-5):
  State: RUNNABLE  Blocked count: 0  Waited count: 29
  Stack:
    sun.management.ThreadImpl.getThreadInfo0(Native Method)
    sun.management.ThreadImpl.getThreadInfo(ThreadImpl.java:147)
    sun.management.ThreadImpl.getThreadInfo(ThreadImpl.java:123)
    org.apache.hadoop.util.ReflectionUtils.printThreadInfo(ReflectionUtils.java:149)
    org.apache.hadoop.http.HttpServer$StackServlet.doGet(HttpServer.java:505)
    javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
    javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
    org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:502)
    org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:363)
    org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
    org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
    org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
    org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:417)
    org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
    org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
    org.mortbay.jetty.Server.handle(Server.java:324)
    org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:534)
    org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:864)
    org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:533)
    org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:207)

Thread 43 (Trash Emptier):
  State: TIMED_WAITING  Blocked count: 0  Waited count: 183
  Stack:
    java.lang.Thread.sleep(Native Method)
    org.apache.hadoop.fs.Trash$Emptier.run(Trash.java:234)
    java.lang.Thread.run(Thread.java:619)

Thread 36 (IPC Server handler 9 on 9000):
  State: WAITING  Blocked count: 32  Waited count: 3444
  Waiting on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@4959d87f
  Stack:
    sun.misc.Unsafe.park(Native Method)
    java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
    java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1925)
    java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:358)
    org.apache.hadoop.ipc.Server$Handler.run(Server.java:939)

Thread 35 (IPC Server handler 8 on 9000):
  State: WAITING  Blocked count: 29  Waited count: 3446
  Waiting on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@4959d87f
  Stack:
    sun.misc.Unsafe.park(Native Method)
    java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
    java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1925)
    java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:358)
    org.apache.hadoop.ipc.Server$Handler.run(Server.java:939)

Thread 34 (IPC Server handler 7 on 9000):
  State: WAITING  Blocked count: 30  Waited count: 3451
  Waiting on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@4959d87f
  Stack:
    sun.misc.Unsafe.park(Native Method)
    java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
    java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1925)
    java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:358)
    org.apache.hadoop.ipc.Server$Handler.run(Server.java:939)

Thread 33 (IPC Server handler 6 on 9000): State
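[A hedged sketch of the change being discussed; the value 2000 is the one Alex mentions, but pick what fits the machine's RAM:]

# hadoop-env.sh on the jobtracker node, then restart the daemon
export HADOOP_HEAPSIZE=2000   # daemon heap in MB, i.e. -Xmx2000m instead of the 1000m default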
Re: jobtracker cannot be started
How large is the blocksize? Hadoop uses a buffer size of 4kb for IO operations; I can imagine that the balancer takes a lot of memory. Check if it is running. Also, a good idea would be to increase the space (add more datanodes). Check whether fs.trash.interval is set (no setting disables the trash facility). Also you can use # hadoop dfs -expunge And, lastly, check the limits on the server (limits.conf) - alex

On Fri, Oct 21, 2011 at 12:09 PM, Peng, Wei wei.p...@xerox.com wrote:
Alex, thank you a lot for helping me. I will figure out how to change the conf file. It seems that even chattr -i does not work. Just one last question: why does restarting the jobtracker need such a big heap size? I had no problem restarting it before the jobtracker hung. One problem of this hadoop cluster that I did not mention is that the DFS space only has 5% left. Thanks Wei
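[To make the trash pointer above concrete, a hedged sketch (the value is illustrative): fs.trash.interval lives in core-site.xml and is the number of minutes a deleted file stays in the trash before its checkpoint is removed; 0, the default, disables the trash facility entirely:]

<property>
  <name>fs.trash.interval</name>
  <!-- keep deleted files for 1 day (value in minutes); 0 disables trash -->
  <value>1440</value>
</property>

[With DFS nearly full, running hadoop dfs -expunge, as suggested above, empties the trash immediately instead of waiting out the interval.]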