Re: HDFS Explained as Comics

2011-11-30 Thread Alexander C.H. Lorenz
Hi all,

very cool comic!

Thanks,
 Alex

On Wed, Nov 30, 2011 at 11:58 PM, Abhishek Pratap Singh manu.i...@gmail.com
 wrote:

 Hi,

 This is indeed a good way to explain; most of the improvements have already
 been discussed. Waiting for the sequel of this comic.

 Regards,
 Abhishek

 On Wed, Nov 30, 2011 at 1:55 PM, maneesh varshney mvarsh...@gmail.com
 wrote:

  Hi Matthew
 
  I agree with both you and Prashant. The strip needs to be modified to
  explain that these can be default values that can be optionally overridden
  (which I will fix in the next iteration).
 
  However, from the 'understanding concepts of HDFS' point of view, I still
  think that block size and replication factor are the real strengths of
  HDFS, and learners must be exposed to them so that they get to see how
  HDFS is significantly different from conventional file systems.
 
  On personal note: thanks for the first part of your message :)
 
  -Maneesh
 
 
  On Wed, Nov 30, 2011 at 1:36 PM, GOEKE, MATTHEW (AG/1000) 
  matthew.go...@monsanto.com wrote:
 
   Maneesh,
  
   Firstly, I love the comic :)
  
    Secondly, I am inclined to agree with Prashant on this latest point. While
    one code path could take us through the user defining command line
    overrides (e.g. hadoop fs -D blah -put foo bar) I think it might confuse a
    person new to Hadoop. The most common flow would be using admin determined
    values from hdfs-site, and the only thing that would need to change is that
    conversation happening between client / server and not user / client.
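    (A concrete form of such an override, using the properties discussed later
    in this thread, might be something like:
       hadoop fs -D dfs.block.size=134217728 -D dfs.replication=2 -put foo bar
    where the property names and values are only an illustration.)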
  
   Matt
  
   -Original Message-
   From: Prashant Kommireddi [mailto:prash1...@gmail.com]
   Sent: Wednesday, November 30, 2011 3:28 PM
   To: common-user@hadoop.apache.org
   Subject: Re: HDFS Explained as Comics
  
    Sure, it's just a case of how readers interpret it.
  
  1. Client is required to specify block size and replication factor each time
  2. Client does not need to worry about it since an admin has set the
  properties in default configuration files
  
    A client would not be allowed to override the default configs if they are
    set final (well, there are ways to go around it, as you suggest, by
    using create() :)
   
    The information is great and helpful. Just want to make sure a beginner who
    wants to write a WordCount in MapReduce does not worry about specifying
    'block size' and 'replication factor' in his code.
  
   Thanks,
   Prashant
  
   On Wed, Nov 30, 2011 at 1:18 PM, maneesh varshney mvarsh...@gmail.com
   wrote:
  
Hi Prashant
   
Others may correct me if I am wrong here..
   
 The client (org.apache.hadoop.hdfs.DFSClient) has knowledge of block size
 and replication factor. In the source code, I see the following in the
 DFSClient constructor:
    
    defaultBlockSize = conf.getLong("dfs.block.size", DEFAULT_BLOCK_SIZE);
    
    defaultReplication = (short) conf.getInt("dfs.replication", 3);
   
 My understanding is that the client considers the following chain for the
 values:
 1. Manual values (the long form constructor; when a user provides these
 values)
 2. Configuration file values (these are cluster level defaults:
 dfs.block.size and dfs.replication)
 3. Finally, the hardcoded values (DEFAULT_BLOCK_SIZE and 3)
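 To spell out that order, here is a minimal illustrative sketch (not the
 actual DFSClient code; resolveBlockSize and userBlockSize are made-up names):
    
    import org.apache.hadoop.conf.Configuration;
    
    public class BlockSizeResolution {
        // hardcoded client-side fallback (64 MB in 0.20.x)
        static final long DEFAULT_BLOCK_SIZE = 64 * 1024 * 1024;
    
        static long resolveBlockSize(long userBlockSize, Configuration conf) {
            if (userBlockSize > 0) {
                return userBlockSize;  // 1. value supplied explicitly by the caller
            }
            // 2. cluster config value, or 3. hardcoded default if the property is absent
            return conf.getLong("dfs.block.size", DEFAULT_BLOCK_SIZE);
        }
    }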
   
 Moreover, in the org.apache.hadoop.hdfs.protocol.ClientProtocol the API to
 create a file is
 void create(..., short replication, long blocksize);
   
 I presume it means that the client already has knowledge of these values
 and passes them to the NameNode when creating a new file.
   
Hope that helps.
   
thanks
-Maneesh
   
On Wed, Nov 30, 2011 at 1:04 PM, Prashant Kommireddi 
   prash1...@gmail.com
wrote:
   
 Thanks Maneesh.

  Quick question: does a client really need to know block size and
  replication factor? A lot of times the client has no control over these
  (set at cluster level).

 -Prashant Kommireddi

 On Wed, Nov 30, 2011 at 12:51 PM, Dejan Menges 
  dejan.men...@gmail.com
 wrote:

  Hi Maneesh,
 
  Thanks a lot for this! Just distributed it over the team and
  comments
are
  great :)
 
  Best regards,
  Dejan
 
  On Wed, Nov 30, 2011 at 9:28 PM, maneesh varshney 
   mvarsh...@gmail.com
  wrote:
 
   For your reading pleasure!
  
    PDF 3.3MB uploaded at (the mailing list has a cap of 1MB attachments):
    https://docs.google.com/open?id=0B-zw6KHOtbT4MmRkZWJjYzEtYjI3Ni00NTFjLWE0OGItYTU5OGMxYjc0N2M1
  
  
    Appreciate it if you can spare some time to peruse this little experiment
    of mine to use comics as a medium to explain computer science topics.
    This particular issue explains the protocols and internals of HDFS.
   
    I am eager to hear your opinions on the usefulness of this visual
   

Re: Issue with DistributedCache

2011-11-24 Thread Alexander C.H. Lorenz
Hi,

a typo?
import com.bejoy.sampels.worcount.WordCountDriver;
= wor_d_count ?

- alex

On Thu, Nov 24, 2011 at 3:45 PM, Bejoy Ks bejoy.had...@gmail.com wrote:

 Hi Denis
   I tried your code without distributed cache locally and it worked
 fine for me. Please find it at
 http://pastebin.com/ki175YUx

 I echo Mike's words on submitting map reduce jobs remotely. The remote
 machine can be your local PC or any utility server, as Mike specified. What
 you need to have on the remote machine is a replica of the hadoop jars and
 configuration files, the same as those of your hadoop cluster. (If you don't
 have a remote util server set up then you can use your dev machine for the
 same.) Just trigger the hadoop job on the local machine, and the actual job
 will be submitted and run on your cluster based on the NN host and
 configuration parameters you have in your config files.
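 As an illustration of those two pointers (hostnames and ports here are only
 placeholders, not values from your cluster), the replica configs on the
 submitting machine would carry roughly:

   <!-- core-site.xml on the remote/submitting machine -->
   <property>
     <name>fs.default.name</name>
     <value>hdfs://namenode-host:8020</value>
   </property>

   <!-- mapred-site.xml on the remote/submitting machine -->
   <property>
     <name>mapred.job.tracker</name>
     <value>jobtracker-host:8021</value>
   </property>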

 Hope it helps!..

 Regards
 Bejoy.K.S

 On Thu, Nov 24, 2011 at 7:09 PM, Michel Segel michael_se...@hotmail.com
 wrote:

   Denis...
  
   Sorry, you lost me.
  
   Just to make sure we're using the same terminology...
   The cluster is comprised of two types of nodes...
   The data nodes, which run DN, TT, and, if you have HBase, RS.
   Then there are control nodes, which run your NN, SN, JT and, if you run
   HBase, HM and ZKs ...
  
   Outside of the cluster we have machines set up with Hadoop installed but
   not running any of the processes. They are where our users launch their
   jobs. We call them edge nodes. (It's not a good idea to let users directly
   on the actual cluster.)
  
   Ok, having said all of that... You launch your job from the edge nodes...
   Your data sits in HDFS, so you don't need distributed cache at all. Does
   that make sense?
   Your job will run on the local machine, connect to the JT and then run.
  
   We set up the edge nodes so that all of the jars and config files are
   already set up for the users and we can better control access...
 
  Sent from a remote device. Please excuse any typos...
 
  Mike Segel
 
  On Nov 24, 2011, at 7:22 AM, Denis Kreis de.kr...@gmail.com wrote:
 
    Without using the distributed cache I'm getting the same error. It's
    because I start the job from a remote client / programmatically.
  
   2011/11/24 Michel Segel michael_se...@hotmail.com:
    Silly question... Why do you need to use the distributed cache for the
    word count program?
    What are you trying to accomplish?
   
    I've only had to play with it for one project where we had to push out
    a bunch of C++ code to the nodes as part of a job...
  
   Sent from a remote device. Please excuse any typos...
  
   Mike Segel
  
   On Nov 24, 2011, at 7:05 AM, Denis Kreis de.kr...@gmail.com wrote:
  
   Hi Bejoy
  
    1. Old API:
    The Map and Reduce classes are the same as in the example; the main
    method is as follows:
   
    public static void main(String[] args) throws IOException,
            InterruptedException {
        UserGroupInformation ugi =
            UserGroupInformation.createProxyUser("remote user name",
                UserGroupInformation.getLoginUser());
        ugi.doAs(new PrivilegedExceptionAction<Void>() {
            public Void run() throws Exception {

                JobConf conf = new JobConf(WordCount.class);
                conf.setJobName("wordcount");

                conf.setOutputKeyClass(Text.class);
                conf.setOutputValueClass(IntWritable.class);

                conf.setMapperClass(Map.class);
                conf.setCombinerClass(Reduce.class);
                conf.setReducerClass(Reduce.class);

                conf.setInputFormat(TextInputFormat.class);
                conf.setOutputFormat(TextOutputFormat.class);

                FileInputFormat.setInputPaths(conf, new Path("path to input dir"));
                FileOutputFormat.setOutputPath(conf, new Path("path to output dir"));

                conf.set("mapred.job.tracker", "ip:8021");

                FileSystem fs = FileSystem.get(new URI("hdfs://ip:8020"),
                        new Configuration());
                fs.mkdirs(new Path("remote path"));
                fs.copyFromLocalFile(new Path("local path/test.jar"),
                        new Path("remote path"));
  
  
  
  
 




-- 
Alexander Lorenz
http://mapredit.blogspot.com



new LAB VM online

2011-11-19 Thread Alexander C.H. Lorenz
Hi,

I created a new testing environment as a VirtualBox image. It contains 4
servers, CDH3u2, HBase, Hive, Stargate, and Sqoop. I use them for testing; I
don't know if anyone else will use them too. The image is around 4 GB and will
deploy 4 servers with 40GB HDD each. I wrote a post about it on my blog.
I think the setup could be very helpful for new users (I use them on my
Macbook when I travel, and it works).

best,
 Alex

-- 
Alexander Lorenz
http://mapredit.blogspot.com



Re: how to start tasktracker only on single port

2011-11-13 Thread Alexander C.H. Lorenz
Hi,

please explain the reason to kill (I assume kill -9) a tasktracker. The
best way is to use the start / stop scripts.

best,
 Alex

On Mon, Nov 14, 2011 at 8:39 AM, mohmmadanis moulavi 
anis_moul...@yahoo.co.in wrote:

 Hello,



 Friends, I am using Hadoop 0.20.2.
 My problem is that whenever I kill the tasktracker and start it again, the
 jobtracker shows one extra tasktracker (the one which was killed and the
 other which was started afterwards).
 I want it to work like this:
 Whenever I kill the tasktracker it will stop sending heartbeats, but
 when I start the tasktracker again, it should start sending heartbeats again,
 i.e. it should start that tasktracker on the same port as before.
 What changes should I make in the configuration parameters for that? Please
 let me know.


 Thanks & Regards,
 Mohmmadanis Moulavi




-- 
Alexander Lorenz
http://mapredit.blogspot.com



Re: OceanSync

2011-11-08 Thread Alexander C.H. Lorenz
Hi,

http://www.cloudera.com/products-services/scm-express/

works. I didn't know OceanSync, sorry.

- Alex

On Tue, Nov 8, 2011 at 8:14 AM, DesignerSmoke designersm...@yahoo.comwrote:


  Does anyone have BETA access to OceanSync's Hadoop Management
  Software?
 The website is http://www.oceansync.com and sourceforge page is
 http://sourceforge.net/p/oceansync/

 Is there other software like this somewhere?
 --
 View this message in context:
 http://old.nabble.com/OceanSync-tp32801476p32801476.html
 Sent from the Hadoop core-user mailing list archive at Nabble.com.




-- 
Alexander Lorenz
http://mapredit.blogspot.com



Re: correct way to reserve space

2011-10-27 Thread Alexander C.H. Lorenz
Hi,

in hdfs-site.xml:

<property>
  <name>dfs.datanode.du.reserved</name>
  <value>VALUE_HERE (maybe 300)</value>
</property>

<property>
  <name>dfs.datanode.du.pct</name>
  <value>0.9f</value>
</property>

But it is not available per partition; it's a general parameter. Please
correct me if I'm wrong.
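As a concrete sketch (only an illustration, assuming the value is given in
bytes), reserving roughly 30 GB per data directory for non-DFS use would look
like:

<property>
  <name>dfs.datanode.du.reserved</name>
  <value>32212254720</value>
</property>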

regards,
 Alex



On Thu, Oct 27, 2011 at 2:24 AM, Rita rmorgan...@gmail.com wrote:

 What is the correct way to reserve space for hdfs?

 I currently have 2 filesystem, /fs1 and /fs2 and I would like to reserve
 space for non-dfs operations. For example, for /fs1 i would like to reserve
 30gb of space for non-dfs and 10gb of space for /fs2 ?


 I fear HADOOP-2991 is still haunting us?

 I am using CDH 3U1




 --
 --- Get your facts first, then you can distort them as you please.--




-- 
Alexander Lorenz
http://mapredit.blogspot.com


Re: correct way to reserve space

2011-10-27 Thread Alexander C.H. Lorenz
Hi Harsh,

ah, nice to know, thanks ;)

best,
 Alex

On Thu, Oct 27, 2011 at 9:27 AM, Harsh J ha...@cloudera.com wrote:

 The percentage opt is not valid on 0.20 iirc.

 On Thursday, October 27, 2011, Alexander C.H. Lorenz 
 wget.n...@googlemail.com wrote:
  Hi,
 
  in hdfs-site.xml:
 
   <property>
    <name>dfs.datanode.du.reserved</name>
    <value>VALUE_HERE (maybe 300)</value>
   </property>
  
   <property>
    <name>dfs.datanode.du.pct</name>
    <value>0.9f</value>
   </property>
 
   But it is not available per partition; it's a general parameter. Please
   correct me if I'm wrong.
 
  regards,
   Alex
 
 
 
  On Thu, Oct 27, 2011 at 2:24 AM, Rita rmorgan...@gmail.com wrote:
 
  What is the correct way to reserve space for hdfs?
 
  I currently have 2 filesystem, /fs1 and /fs2 and I would like to reserve
  space for non-dfs operations. For example, for /fs1 i would like to
 reserve
  30gb of space for non-dfs and 10gb of space for /fs2 ?
 
 
  I fear HADOOP-2991 is still haunting us?
 
  I am using CDH 3U1
 
 
 
 
  --
  --- Get your facts first, then you can distort them as you please.--
 
 
 
 
  --
  Alexander Lorenz
  http://mapredit.blogspot.com
 

 --
 Harsh J




-- 
Alexander Lorenz
http://mapredit.blogspot.com


Re: running sqoop on hadoop cluster

2011-10-21 Thread Alexander C.H. Lorenz
Hi,

first setup a valid cluster:
namenode, secondary namenode, jobtracker + datanodes with tasktracker.

After that, install sqoop on a datanode and play with it ;)

Here is a howto for RedHat (CentOS):
http://mapredit.blogspot.com/p/get-hadoop-cluster-running-in-20.html

and for Ubuntu:
http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/

regards,
 Alex

On Fri, Oct 21, 2011 at 2:03 AM, firantika firantika.agust...@gmail.comwrote:


 Hi All,
  I'm a newbie on Hadoop.

  If I install Hadoop on 2 nodes, where does HDFS run? On the master or the
  slave node?

  And then, if I run Sqoop to export from a DBMS to Hive, is there a speed-up
  between Hadoop running on a single node and Hadoop running on multiple
  nodes?

  Please give me an explanation.


 Tks


 --
 View this message in context:
 http://old.nabble.com/running-sqoop-on-hadoop-cluster-tp32693398p32693398.html
 Sent from the Hadoop core-user mailing list archive at Nabble.com.




-- 
Alexander Lorenz
http://mapredit.blogspot.com


Re: jobtracker cannot be started

2011-10-21 Thread Alexander C.H. Lorenz
Hi,

what heap size have you given the jobtracker? And how many jobs /
users / tasks are running? What do the logs say?
Turn on GC logging:
http://java.sun.com/developer/technicalArticles/Programming/GCPortal/

- Alex


On Fri, Oct 21, 2011 at 9:47 AM, Peng, Wei wei.p...@xerox.com wrote:

 Hi,



 When I was running a job on hadoop with 75% of mappers finished, the
 jobtracker hung so that I could not access
 jobtrackerserver:7845/jobtracker.jsp, and hadoop job -status hung as
 well.



 Then I stopped jobtracker and restarted it. However, the jobtracker
 cannot be started. I received error message from jobtracker.log.out
 saying



 Exception in thread LeaseChecker java.lang.OutOfMemoryError: Java heap
 space

at
 java.io.BufferedOutputStream.init(BufferedOutputStream.java:59)

at
 java.io.BufferedOutputStream.init(BufferedOutputStream.java:42)

at
 org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:318)

at
 org.apache.hadoop.ipc.Client$Connection.access$1700(Client.java:176)

at org.apache.hadoop.ipc.Client.getConnection(Client.java:859)

at org.apache.hadoop.ipc.Client.call(Client.java:719)

at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)

at $Proxy4.renewLease(Unknown Source)

at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)

at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessor
 Impl.java:25)

at java.lang.reflect.Method.invoke(Method.java:597)

at
 org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvo
 cationHandler.java:82)

at
 org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocation
 Handler.java:59)

at $Proxy4.renewLease(Unknown Source)

at
 org.apache.hadoop.hdfs.DFSClient$LeaseChecker.renew(DFSClient.java:1016)

at
 org.apache.hadoop.hdfs.DFSClient$LeaseChecker.run(DFSClient.java:1028)

at java.lang.Thread.run(Thread.java:619)

 Exception in thread expireTrackers java.lang.OutOfMemoryError: Java
 heap space

at java.util.Arrays.copyOf(Arrays.java:2882)

at
 java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.jav
 a:100)

at
 java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:390)

at java.lang.StringBuffer.append(StringBuffer.java:224)

at org.apache.hadoop.mapred.JobHistory.log(JobHistory.java:354)

at
 org.apache.hadoop.mapred.JobHistory$MapAttempt.logStarted(JobHistory.jav
 a:1354)

at
 org.apache.hadoop.mapred.JobInProgress.failedTask(JobInProgress.java:233
 2)

at
 org.apache.hadoop.mapred.JobInProgress.updateTaskStatus(JobInProgress.ja
 va:849)

at
 org.apache.hadoop.mapred.JobInProgress.failedTask(JobInProgress.java:246
 3)

at
 org.apache.hadoop.mapred.JobTracker.lostTaskTracker(JobTracker.java:3474
 )

at
 org.apache.hadoop.mapred.JobTracker$ExpireTrackers.run(JobTracker.java:3
 48)

at java.lang.Thread.run(Thread.java:619)

 Exception in thread IPC Server listener on 9001
 java.lang.OutOfMemoryError: Java heap space

 java.lang.reflect.InvocationTargetException

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.jav
 a:39)

at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessor
 Impl.java:25)

at java.lang.reflect.Method.invoke(Method.java:597)

at org.mortbay.log.Slf4jLog.warn(Slf4jLog.java:126)

at org.mortbay.log.Log.warn(Log.java:181)

at
 org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:449)

at
 org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:2
 16)

at
 org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)

at
 org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)

at
 org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:417)

at
 org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandler
 Collection.java:230)

at
 org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)

at org.mortbay.jetty.Server.handle(Server.java:324)

at
 org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:534)

at
 org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConne
 ction.java:864)

at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:533)

at
 org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:207)

at
 org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:403)

at
 org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:
 409)

at
 org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java
 :522)

 Caused by: java.lang.OutOfMemoryError: Java heap space

 

Re: jobtracker cannot be started

2011-10-21 Thread Alexander C.H. Lorenz
Add the opts to the JDK call in hadoop-env.sh. The logs should be
accessible in the hadoop log directory.
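For example, a sketch of what the hadoop-env.sh entry could look like
(assuming your hadoop-env.sh has the per-daemon opts variables; the log path
is just a placeholder):

export HADOOP_JOBTRACKER_OPTS="-verbose:gc -XX:+PrintGCDetails \
  -XX:+PrintGCTimeStamps -Xloggc:/var/log/hadoop/jobtracker-gc.log \
  $HADOOP_JOBTRACKER_OPTS"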

Also check http://jobtracker:50030/stacks - that's the same as jstack (jstack
PID). You can also use jstack -F PID to get a core file (similar to /stacks, I
think) on the jobtracker.

Are you using 64bit-JDK? Which version?

regards,
 Alex

On Fri, Oct 21, 2011 at 10:00 AM, Peng, Wei wei.p...@xerox.com wrote:

 I am using the default heap size, which is 1000MB. The jobtracker hung
 when I was running only one job. Now I cannot even restart the
 jobtracker.
 Can you teach me how to turn on GC logging in hadoop?

 Thanks!
 Wei

 -Original Message-
 From: Alexander C.H. Lorenz [mailto:wget.n...@googlemail.com]
 Sent: Friday, October 21, 2011 3:54 AM
 To: common-user@hadoop.apache.org
 Subject: Re: jobtracker cannot be started

 Hi,

 what are the heap size you given at the jobtracker? And how much jobs /
 users / tasks are run? What say a log?
 Turn on GC logging:
 http://java.sun.com/developer/technicalArticles/Programming/GCPortal/

 - Alex


 On Fri, Oct 21, 2011 at 9:47 AM, Peng, Wei wei.p...@xerox.com wrote:

  Hi,
 
 
 
  When I was running a job on hadoop with 75% mappers finished, the
  jobtracker hung so that I cannot access
  jobtrackerserver:7845/jobtracker.jsp and hadoop job -status hung as
  well.
 
 
 
  Then I stopped jobtracker and restarted it. However, the jobtracker
  cannot be started. I received error message from jobtracker.log.out
  saying
 
 
 
  Exception in thread LeaseChecker java.lang.OutOfMemoryError: Java
 heap
  space
 
 at
  java.io.BufferedOutputStream.init(BufferedOutputStream.java:59)
 
 at
  java.io.BufferedOutputStream.init(BufferedOutputStream.java:42)
 
 at
 
 org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:318)
 
 at
  org.apache.hadoop.ipc.Client$Connection.access$1700(Client.java:176)
 
 at org.apache.hadoop.ipc.Client.getConnection(Client.java:859)
 
 at org.apache.hadoop.ipc.Client.call(Client.java:719)
 
 at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
 
 at $Proxy4.renewLease(Unknown Source)
 
 at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
 
 at
 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessor
  Impl.java:25)
 
 at java.lang.reflect.Method.invoke(Method.java:597)
 
 at
 
 org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvo
  cationHandler.java:82)
 
 at
 
 org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocation
  Handler.java:59)
 
 at $Proxy4.renewLease(Unknown Source)
 
 at
 
 org.apache.hadoop.hdfs.DFSClient$LeaseChecker.renew(DFSClient.java:1016)
 
 at
  org.apache.hadoop.hdfs.DFSClient$LeaseChecker.run(DFSClient.java:1028)
 
 at java.lang.Thread.run(Thread.java:619)
 
  Exception in thread expireTrackers java.lang.OutOfMemoryError: Java
  heap space
 
 at java.util.Arrays.copyOf(Arrays.java:2882)
 
 at
 
 java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.jav
  a:100)
 
 at
  java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:390)
 
 at java.lang.StringBuffer.append(StringBuffer.java:224)
 
 at org.apache.hadoop.mapred.JobHistory.log(JobHistory.java:354)
 
 at
 
 org.apache.hadoop.mapred.JobHistory$MapAttempt.logStarted(JobHistory.jav
  a:1354)
 
 at
 
 org.apache.hadoop.mapred.JobInProgress.failedTask(JobInProgress.java:233
  2)
 
 at
 
 org.apache.hadoop.mapred.JobInProgress.updateTaskStatus(JobInProgress.ja
  va:849)
 
 at
 
 org.apache.hadoop.mapred.JobInProgress.failedTask(JobInProgress.java:246
  3)
 
 at
 
 org.apache.hadoop.mapred.JobTracker.lostTaskTracker(JobTracker.java:3474
  )
 
 at
 
 org.apache.hadoop.mapred.JobTracker$ExpireTrackers.run(JobTracker.java:3
  48)
 
 at java.lang.Thread.run(Thread.java:619)
 
  Exception in thread IPC Server listener on 9001
  java.lang.OutOfMemoryError: Java heap space
 
  java.lang.reflect.InvocationTargetException
 
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 
 at
 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.jav
  a:39)
 
 at
 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessor
  Impl.java:25)
 
 at java.lang.reflect.Method.invoke(Method.java:597)
 
 at org.mortbay.log.Slf4jLog.warn(Slf4jLog.java:126)
 
 at org.mortbay.log.Log.warn(Log.java:181)
 
 at
 
 org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:449)
 
 at
 
 org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:2
  16)
 
 at
 
 org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
 
 at
 
 org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766

Re: jobtracker cannot be started

2011-10-21 Thread Alexander C.H. Lorenz
)

 org.mortbay.io.nio.SelectorManager$SelectSet.doSelect(SelectorManager.ja
 va:429)

 org.mortbay.io.nio.SelectorManager.doSelect(SelectorManager.java:185)

 org.mortbay.jetty.nio.SelectChannelConnector.accept(SelectChannelConnect
 or.java:124)

 org.mortbay.jetty.AbstractConnector$Acceptor.run(AbstractConnector.java:
 707)

 org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java
 :522)
 Thread 22
 (org.apache.hadoop.hdfs.server.namenode.DecommissionManager$Monitor@67c7
 980c):
  State: TIMED_WAITING
  Blocked count: 5
  Waited count: 127
  Stack:
java.lang.Thread.sleep(Native Method)

 org.apache.hadoop.hdfs.server.namenode.DecommissionManager$Monitor.run(D
 ecommissionManager.java:65)
java.lang.Thread.run(Thread.java:619)
 Thread 21
 (org.apache.hadoop.hdfs.server.namenode.FSNamesystem$ReplicationMonitor@
 2094257f):
  State: TIMED_WAITING
  Blocked count: 20
  Waited count: 1263
  Stack:
java.lang.Thread.sleep(Native Method)

 org.apache.hadoop.hdfs.server.namenode.FSNamesystem$ReplicationMonitor.r
 un(FSNamesystem.java:2304)
java.lang.Thread.run(Thread.java:619)
 Thread 20
 (org.apache.hadoop.hdfs.server.namenode.LeaseManager$Monitor@3a51127a):
  State: TIMED_WAITING
  Blocked count: 21
  Waited count: 1875
  Stack:
java.lang.Thread.sleep(Native Method)

 org.apache.hadoop.hdfs.server.namenode.LeaseManager$Monitor.run(LeaseMan
 ager.java:349)
java.lang.Thread.run(Thread.java:619)
 Thread 19
 (org.apache.hadoop.hdfs.server.namenode.FSNamesystem$HeartbeatMonitor@61
 578aab):
  State: TIMED_WAITING
  Blocked count: 0
  Waited count: 13
  Stack:
java.lang.Thread.sleep(Native Method)

 org.apache.hadoop.hdfs.server.namenode.FSNamesystem$HeartbeatMonitor.run
 (FSNamesystem.java:2286)
java.lang.Thread.run(Thread.java:619)
 Thread 18
 (org.apache.hadoop.hdfs.server.namenode.PendingReplicationBlocks$Pending
 ReplicationMonitor@2339e351):
  State: TIMED_WAITING
  Blocked count: 0
  Waited count: 13
  Stack:
java.lang.Thread.sleep(Native Method)

 org.apache.hadoop.hdfs.server.namenode.PendingReplicationBlocks$PendingR
 eplicationMonitor.run(PendingReplicationBlocks.java:186)
java.lang.Thread.run(Thread.java:619)
 Thread 9 (RMI TCP Accept-0):
  State: RUNNABLE
  Blocked count: 0
  Waited count: 0
  Stack:
java.net.PlainSocketImpl.socketAccept(Native Method)
java.net.PlainSocketImpl.accept(PlainSocketImpl.java:384)
java.net.ServerSocket.implAccept(ServerSocket.java:453)
java.net.ServerSocket.accept(ServerSocket.java:421)

 sun.rmi.transport.tcp.TCPTransport$AcceptLoop.executeAcceptLoop(TCPTrans
 port.java:369)

 sun.rmi.transport.tcp.TCPTransport$AcceptLoop.run(TCPTransport.java:341)
java.lang.Thread.run(Thread.java:619)
 Thread 4 (Signal Dispatcher):
  State: RUNNABLE
  Blocked count: 0
  Waited count: 0
  Stack:
 Thread 3 (Finalizer):
  State: WAITING
  Blocked count: 0
  Waited count: 40
  Waiting on java.lang.ref.ReferenceQueue$Lock@22f62eba
  Stack:
java.lang.Object.wait(Native Method)
java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:116)
java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:132)
java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)
 Thread 2 (Reference Handler):
  State: WAITING
  Blocked count: 1
  Waited count: 39
  Waiting on java.lang.ref.Reference$Lock@646d6aa0
  Stack:
java.lang.Object.wait(Native Method)
java.lang.Object.wait(Object.java:485)
java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)
 Thread 1 (main):
  State: WAITING
  Blocked count: 10
  Waited count: 10
  Waiting on org.apache.hadoop.ipc.RPC$Server@41f6321
  Stack:
java.lang.Object.wait(Native Method)
java.lang.Object.wait(Object.java:485)
org.apache.hadoop.ipc.Server.join(Server.java:1122)

 org.apache.hadoop.hdfs.server.namenode.NameNode.join(NameNode.java:292)

 org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:966)

 Wei

 -Original Message-
 From: Alexander C.H. Lorenz [mailto:wget.n...@googlemail.com]
 Sent: Friday, October 21, 2011 4:15 AM
 To: common-user@hadoop.apache.org
 Subject: Re: jobtracker cannot be started

 add into hadoop-env.sh the opts to the jdk-call. The logs should be
 accessible at he hadoop-log-directory.

 Also check http://jobtracker:50030/stacks - thats the same as jstack
 (jstack
 PID). Also you can use jstack -F PID to get a corefile (similar to
 /stacks I
 think) @jobtracker.

 Are you using 64bit-JDK? Which version?

 regards,
  Alex

 On Fri, Oct 21, 2011 at 10:00 AM, Peng, Wei wei.p...@xerox.com wrote:

  I am using the default heap size, which is 1000MB. The jobtracker hung
  when only I was running one job. Now I could not even restart the
  jobtracker.
  Can you teach me how to turn on GC logging in hadoop?
 
  Thanks!
  Wei
 
  -Original Message-
  From: Alexander C.H. Lorenz [mailto:wget.n...@googlemail.com]
  Sent: Friday, October 21, 2011 3:54 AM
  To: common-user@hadoop.apache.org
  Subject: Re: jobtracker

Re: jobtracker cannot be started

2011-10-21 Thread Alexander C.H. Lorenz
Should, yes ;) I use 2000 in our environment, but it depends on the memory of
your servers.
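For example, a sketch of the corresponding hadoop-env.sh line (the value is in
MB and should fit the RAM of the jobtracker host):

export HADOOP_HEAPSIZE=2000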

regards,
 Alex

On Fri, Oct 21, 2011 at 10:58 AM, Peng, Wei wei.p...@xerox.com wrote:

 Yes, the heap size is the default 1000m: /bin/java -Xmx1000m
 So if I change the heapsize to be bigger, I should be able to solve
 this problem?

 Thanks,
 Wei

 -Original Message-
 From: Alexander C.H. Lorenz [mailto:wget.n...@googlemail.com]
 Sent: Friday, October 21, 2011 4:53 AM
 To: common-user@hadoop.apache.org
 Subject: Re: jobtracker cannot be started

 looks like that the Heap utilization has exceeded the value set by -Xmx.
 Do
 a ps waux|grep java @jobtracker  |grep -i xmx
 The heapsize will be set in hadoop-env.sh:
 export HADOOP_HEAPSIZE=

 default 1000, I think.

 - alex

 On Fri, Oct 21, 2011 at 10:31 AM, Peng, Wei wei.p...@xerox.com wrote:

  Thank you for your quick reply!!
 
  I cannot change the hadoop conf files because they are owned by a
 person
  who has left the company, though I have the root access. My Java
 version
  is java version 1.5.0_07
  Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_07-b03)
  Java HotSpot(TM) 64-Bit Server VM (build 1.5.0_07-b03, mixed mode)
 
  The log on http://jobtracker:50030/stacks is
  Process Thread Dump:
  26 active threads
  Thread 53 (1424598978@qtp0-5):
   State: RUNNABLE
   Blocked count: 0
   Waited count: 29
   Stack:
 sun.management.ThreadImpl.getThreadInfo0(Native Method)
 sun.management.ThreadImpl.getThreadInfo(ThreadImpl.java:147)
 sun.management.ThreadImpl.getThreadInfo(ThreadImpl.java:123)
 
 
 org.apache.hadoop.util.ReflectionUtils.printThreadInfo(ReflectionUtils.j
  ava:149)
 
 
 org.apache.hadoop.http.HttpServer$StackServlet.doGet(HttpServer.java:505
  )
 javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
 javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
 
  org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:502)
 
 
 org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:363)
 
 
 org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:2
  16)
 
 
 org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
 
 
 org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
 
  org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:417)
 
 
 org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandler
  Collection.java:230)
 
 
 org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
 org.mortbay.jetty.Server.handle(Server.java:324)
 
 
 org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:534)
 
 
 org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConne
  ction.java:864)
 org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:533)
 org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:207)
  Thread 43 (Trash Emptier):
   State: TIMED_WAITING
   Blocked count: 0
   Waited count: 183
   Stack:
 java.lang.Thread.sleep(Native Method)
 org.apache.hadoop.fs.Trash$Emptier.run(Trash.java:234)
 java.lang.Thread.run(Thread.java:619)
  Thread 36 (IPC Server handler 9 on 9000):
   State: WAITING
   Blocked count: 32
   Waited count: 3444
   Waiting on
 
 java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@49
  59d87f
   Stack:
 sun.misc.Unsafe.park(Native Method)
 java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
 
 
 java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.aw
  ait(AbstractQueuedSynchronizer.java:1925)
 
 
 java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:3
  58)
 org.apache.hadoop.ipc.Server$Handler.run(Server.java:939)
  Thread 35 (IPC Server handler 8 on 9000):
   State: WAITING
   Blocked count: 29
   Waited count: 3446
   Waiting on
 
 java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@49
  59d87f
   Stack:
 sun.misc.Unsafe.park(Native Method)
 java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
 
 
 java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.aw
  ait(AbstractQueuedSynchronizer.java:1925)
 
 
 java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:3
  58)
 org.apache.hadoop.ipc.Server$Handler.run(Server.java:939)
  Thread 34 (IPC Server handler 7 on 9000):
   State: WAITING
   Blocked count: 30
   Waited count: 3451
   Waiting on
 
 java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@49
  59d87f
   Stack:
 sun.misc.Unsafe.park(Native Method)
 java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
 
 
 java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.aw
  ait(AbstractQueuedSynchronizer.java:1925)
 
 
 java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:3
  58)
 org.apache.hadoop.ipc.Server$Handler.run(Server.java:939)
  Thread 33 (IPC Server handler 6 on 9000):
   State

Re: jobtracker cannot be started

2011-10-21 Thread Alexander C.H. Lorenz
How large is the block size? Hadoop uses a buffer size of 4 KB for IO
operations; I can imagine that the balancer takes a lot of memory. Check if
it is running. Also, a good idea would be to increase the space (add more
datanodes). Check whether fs.trash.interval is set (if it is not set, the
trash facility is disabled).
You can also use: # hadoop dfs -expunge

And, lastly, check the limits on the server (limits.conf).
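A sketch of that trash setting in core-site.xml, assuming you wanted deleted
files kept for 24 hours (the value is in minutes; 0 disables the trash):

<property>
  <name>fs.trash.interval</name>
  <value>1440</value>
</property>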

- alex


On Fri, Oct 21, 2011 at 12:09 PM, Peng, Wei wei.p...@xerox.com wrote:

 Alex, thank you a lot for helping me. I will figure out how to change
 the conf file. It seems that even chattr -i does not work.
 Just one last question: why does restarting the jobtracker need such a big
 heap size? I had no problem restarting it before the jobtracker hung.
 One problem of this hadoop cluster that I did not mention is that the DFS
 space only has 5% left.

 Thanks
 Wei

 -Original Message-
 From: Alexander C.H. Lorenz [mailto:wget.n...@googlemail.com]
 Sent: Friday, October 21, 2011 5:01 AM
 To: common-user@hadoop.apache.org
 Subject: Re: jobtracker cannot be started

 should, yes ;) I use 2000 in our environment, but depends on the memory
 on
 your servers.

 regards,
  Alex

 On Fri, Oct 21, 2011 at 10:58 AM, Peng, Wei wei.p...@xerox.com wrote:

  Yes, the heap size the default 1000m. /bin/java -Xmx1000m
  So if I can change the heapsize to be bigger, I should be able to
 solve
  this problem?
 
  Thanks,
  Wei
 
  -Original Message-
  From: Alexander C.H. Lorenz [mailto:wget.n...@googlemail.com]
  Sent: Friday, October 21, 2011 4:53 AM
  To: common-user@hadoop.apache.org
  Subject: Re: jobtracker cannot be started
 
  looks like that the Heap utilization has exceeded the value set by
 -Xmx.
  Do
  a ps waux|grep java @jobtracker  |grep -i xmx
  The heapsize will be set in hadoop-env.sh:
  export HADOOP_HEAPSIZE=
 
  default 1000, I think.
 
  - alex
 
  On Fri, Oct 21, 2011 at 10:31 AM, Peng, Wei wei.p...@xerox.com
 wrote:
 
   Thank you for your quick reply!!
  
   I cannot change the hadoop conf files because they are owned by a
  person
   who has left the company, though I have the root access. My Java
  version
   is java version 1.5.0_07
   Java(TM) 2 Runtime Environment, Standard Edition (build
 1.5.0_07-b03)
   Java HotSpot(TM) 64-Bit Server VM (build 1.5.0_07-b03, mixed mode)
  
   The log on http://jobtracker:50030/stacks is
   Process Thread Dump:
   26 active threads
   Thread 53 (1424598978@qtp0-5):
State: RUNNABLE
Blocked count: 0
Waited count: 29
Stack:
  sun.management.ThreadImpl.getThreadInfo0(Native Method)
  sun.management.ThreadImpl.getThreadInfo(ThreadImpl.java:147)
  sun.management.ThreadImpl.getThreadInfo(ThreadImpl.java:123)
  
  
 
 org.apache.hadoop.util.ReflectionUtils.printThreadInfo(ReflectionUtils.j
   ava:149)
  
  
 
 org.apache.hadoop.http.HttpServer$StackServlet.doGet(HttpServer.java:505
   )
  javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
  javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
  
  
 org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:502)
  
  
 
 org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:363)
  
  
 
 org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:2
   16)
  
  
 
 org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
  
  
 
 org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
  
  
 org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:417)
  
  
 
 org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandler
   Collection.java:230)
  
  
 
 org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
  org.mortbay.jetty.Server.handle(Server.java:324)
  
  
 
 org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:534)
  
  
 
 org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConne
   ction.java:864)
  org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:533)
  org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:207)
   Thread 43 (Trash Emptier):
State: TIMED_WAITING
Blocked count: 0
Waited count: 183
Stack:
  java.lang.Thread.sleep(Native Method)
  org.apache.hadoop.fs.Trash$Emptier.run(Trash.java:234)
  java.lang.Thread.run(Thread.java:619)
   Thread 36 (IPC Server handler 9 on 9000):
State: WAITING
Blocked count: 32
Waited count: 3444
Waiting on
  
 
 java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@49
   59d87f
Stack:
  sun.misc.Unsafe.park(Native Method)
  java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
  
  
 
 java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.aw
   ait(AbstractQueuedSynchronizer.java:1925)
  
  
 
 java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:3
   58)
  org.apache.hadoop.ipc.Server$Handler.run(Server.java:939)
   Thread 35 (IPC