Re: XML parsing in Hadoop

2013-11-27 Thread Mirko Kämpf
Chhaya, did you run the same code in stand-alone mode, without the MapReduce framework? How long does the code in your map() function take standalone? Compare those two times (t_0 MR mode, t_1 standalone mode) to find out whether it is an MR issue or something that comes from the xml-parser logic or th

XML parsing in Hadoop

2013-11-27 Thread Chhaya Vishwakarma
Hi, The code below parses an XML file. The output of the code is correct, but the job takes a long time to complete: it took 20 hours to parse a 2MB file. Kindly suggest what changes could be made to improve the performance. package xml; import java.io.FileInputStream; import java.io.FileNo
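(20 hours for a 2MB file almost always points at the parsing logic rather than MapReduce itself, e.g. rebuilding a parser for every record or building a DOM per record. As a hedged, stand-alone sketch of the streaming alternative, not the poster's actual code: the element name `rec` and the input string are invented for illustration.)

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class StreamingXmlCount {
    public static void main(String[] args) throws Exception {
        // Hypothetical input; in a real mapper this would come from the record value.
        String xml = "<records><rec id=\"1\"/><rec id=\"2\"/><rec id=\"3\"/></records>";

        // Create the (relatively expensive) parser once and reuse it,
        // rather than constructing a new one inside every map() call.
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();

        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                if ("rec".equals(qName)) {
                    count[0]++;  // constant-memory streaming; no DOM is built
                }
            }
        };
        parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
                     handler);
        System.out.println("rec elements: " + count[0]);
    }
}
```

Timing this standalone (as Mirko suggests in the reply above) against the MR run would show quickly whether the parser or the framework is the bottleneck.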

RE: Suddenly NameNode stopped responding

2013-11-27 Thread Sandeep L
Hi, Thanks for the update. After spending quite a bit of time on Hadoop/HBase I couldn't find anything awkward in the logs. What I finally found is that the reason for the outage was an IO error thrown by one of the disks on which we store the NameNode files. One more suggestion we need is regarding NameNod

Re: Start and Stop Namenode

2013-11-27 Thread Ascot Moss
Hi, yes, I found the reason; it was the following issue: 'org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /usr/local/hadoop/yarn/yarn_data/tmp/dfs/name is in an inconsistent state'. Formatted the HDFS again and fixed the issue. jps 3774 Jps 3701 Na
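(Worth noting: the failing directory here lives under a `tmp` path. When `dfs.namenode.name.dir` falls back to a location under `hadoop.tmp.dir`, the metadata can be wiped between restarts, which produces exactly this InconsistentFSStateException and forces a reformat. A hedged hdfs-site.xml fragment; the path is illustrative, not from the thread:)

```xml
<!-- hdfs-site.xml: keep NameNode metadata on a persistent path, not under tmp -->
<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:///data/hadoop/dfs/name</value>
</property>
```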

Re: Start and Stop Namenode

2013-11-27 Thread Harsh J
Yes, you should expect to see a NameNode separately available, but apparently it's dying out. Check the NN's log on that machine to see why. On Thu, Nov 28, 2013 at 8:37 AM, Ascot Moss wrote: > Hi, > > I am new to 2.2.0, after running the following command to start the first > namenode, I used jps t

Re: what is the use of YARN proxyserver?

2013-11-27 Thread Jian He
This link http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/WebApplicationProxy.html explains some of it. Jian On Wed, Nov 27, 2013 at 7:47 PM, Henry Hung wrote: > Hi All, > > > > Could someone explain to me what is the use of YARN proxyserver? > > > > I ask this because appar

what is the use of YARN proxyserver?

2013-11-27 Thread Henry Hung
Hi All, Could someone explain to me what is the use of the YARN proxyserver? I ask this because apparently MapReduce jobs can execute and complete without starting the proxyserver. Best regards, Henry Hung

Re: Capacity Scheduler Issue

2013-11-27 Thread Munna
I have set *yarn.scheduler.capacity.maximum-am-resource-percent=0.1*. What is the best value? Tx, Munna On Thu, Nov 28, 2013 at 12:34 AM, Jian He wrote: > The log shows that both queues are properly picked up by the RM. > If the problem is that your submitted application is not able to run, y
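(For reference, 0.1 is the default in this Hadoop line, i.e. at most 10% of cluster resources go to ApplicationMasters; there is no universal "best" value, but raising it allows more AMs to run concurrently. A hedged capacity-scheduler.xml fragment; 0.5 is purely illustrative:)

```xml
<!-- capacity-scheduler.xml: allow up to 50% of cluster resources for AMs
     (the value is an example, tune to your workload) -->
<property>
  <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
  <value>0.5</value>
</property>
```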

Start and Stop Namenode

2013-11-27 Thread Ascot Moss
Hi, I am new to 2.2.0. After running the following command to start the first namenode, I used jps to check the cluster: ./sbin/hadoop-daemon.sh --script hdfs start namenode starting namenode, logging to /usr/local/hadoop/yarn/hadoop//logs/hadoop-hduser-namenode-hd01.emblocsoft.net.ou

Re: Error for larger jobs

2013-11-27 Thread Ted Yu
Siddharth: Take a look at 2.1.2.5. ulimit and nproc under http://hbase.apache.org/book.html#os Cheers On Wed, Nov 27, 2013 at 6:04 PM, Azuryy Yu wrote: > yes. you need to increase it, a simple way is put it in your /etc/profile > > > > > On Thu, Nov 28, 2013 at 9:59 AM, Siddharth Tiwari < >
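(The HBase book section Ted points to recommends raising the limits persistently, per user, in /etc/security/limits.conf rather than via an ad-hoc ulimit in a shell profile. A hedged sketch; the user name and values are illustrative, not from the thread:)

```
# /etc/security/limits.conf — raise limits for the user running Hadoop/HBase
# ("-" sets both soft and hard limits)
hadoop  -  nofile  32768
hadoop  -  nproc   32000
```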

RE: Error for larger jobs

2013-11-27 Thread Siddharth Tiwari
What shall I put in my bash_profile? Date: Thu, 28 Nov 2013 10:04:58 +0800 Subject: Re: Error for larger jobs From: azury...@gmail.com To: user@hadoop.apache.org yes. you need to increase it, a simple way is put it in your /etc/profile On Thu, Nov 28, 2013 at 9:59 AM, Siddharth Tiwari wrote

Re: Error for larger jobs

2013-11-27 Thread Azuryy Yu
Yes, you need to increase it; a simple way is to put it in your /etc/profile. On Thu, Nov 28, 2013 at 9:59 AM, Siddharth Tiwari wrote: > Hi Vinay and Azuryy > Thanks for your responses. > I get these error when I just run a teragen. > Also, do you suggest me to increase nproc value ? What should

Re: Error for larger jobs

2013-11-27 Thread Siddharth Tiwari
Hi Vinay and Azuryy, Thanks for your responses. I get these errors when I just run a teragen. Also, do you suggest I increase the nproc value? What should I increase it to? Sent from my iPad > On Nov 27, 2013, at 11:08 PM, "Vinayakumar B" > wrote: > > Hi Siddharth, > > Looks like the issue w

Re: Error for larger jobs

2013-11-27 Thread Azuryy Yu
Siddharth, please check 'mapred.local.dir', but I would advise you to check the GC logs and OS logs; pay more attention to the OS logs. I suspect you start too many threads concurrently, which then consumes all available OS resources. On Thu, Nov 28, 2013 at 9:08 AM, Vinayakumar B wrote: > Hi Siddharth, >

RE: Error for larger jobs

2013-11-27 Thread Vinayakumar B
Hi Siddharth, Looks like an issue with one of the machines. Or is it happening on other machines also? I don't think it's a problem with JVM heap memory. I suggest you check this once: http://stackoverflow.com/questions/8384000/java-io-ioexception-error-11 Thanks and Regards, Vinayakumar B

RE: Error for larger jobs

2013-11-27 Thread Siddharth Tiwari
Hi Azuryy, Thanks for the response. I have plenty of space on my disks, so that cannot be the issue. Cheers !!! Siddharth Tiwari Have a refreshing day !!! "Every duty is holy, and devotion to duty is the highest form of worship of God." "Maybe other people will try to li

Re: Error for larger jobs

2013-11-27 Thread Azuryy Yu
From the log, your disk is full. On 2013-11-28 5:27 AM, "Siddharth Tiwari" wrote: > Hi Team > > I am getting following strange error, can you point me to the possible > reason. > I have set heap size to 4GB but still getting it. please help > > *syslog logs* > > 2013-11-27 19:01:50,678 WARN org.ap

Error=11 resource temporarily unavailable

2013-11-27 Thread Siddharth Tiwari
Hi team, I am getting this strange error; below is the trace: 2013-11-27 19:01:50,678 WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 2013-11-27 19:01:51,051 WARN mapreduce.Counters: Group org.apache.

Re: NN stopped and cannot recover with error "There appears to be a gap in the edit log"

2013-11-27 Thread Adam Kawa
Maybe you can play with the "offline edits viewer". I have never run into such an issue, thus I have never used the offline edits viewer on production datasets, but it has some options that could perhaps be useful for troubleshooting and fixing. [kawaa@localhost Desktop]$ hdfs oev Usag

Re: org.apache.hadoop.mapred.TaskTracker: Caught exception: java.net.UnknownHostException: Invalid host name:

2013-11-27 Thread Adam Kawa
As far as I remember (we might have run into such an issue ~6 months ago), the TaskTracker can cache the hostname of the JobTracker. Try restarting the TaskTrackers to check whether they connect correctly. Please let me know if restarting the TTs helped. 2013/11/15 kumar y > > Hi, > > we changed the jobtracker name

Re: mapred.tasktacker.reduce.tasks.maximum issue

2013-11-27 Thread Adam Kawa
It looks like you have a typo in the names of the configuration properties, so Hadoop ignores them and uses the default values (2 map and 2 reduce tasks per node). It should be mapred.*tasktracker*.reduce.tasks.maximum, not mapred.*tasktacker*.reduce.tasks.maximum (tasktRacker, not tasktacker) - the same
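(For reference, the corrected property names as they would appear in mapred-site.xml; the values of 4 are illustrative, not a recommendation from the thread:)

```xml
<!-- mapred-site.xml: note "tasktracker", not "tasktacker" -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>4</value>
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>4</value>
</property>
```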

Error for larger jobs

2013-11-27 Thread Siddharth Tiwari
Hi Team, I am getting the following strange error; can you point me to the possible reason? I have set the heap size to 4GB but am still getting it. Please help. syslog logs: 2013-11-27 19:01:50,678 WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using bui

Re: Capacity Scheduler Issue

2013-11-27 Thread Jian He
The log shows that both queues are properly picked up by the RM. If the problem is that your submitted application is not able to run, you may try increasing yarn.scheduler.capacity.maximum-am-resource-percent; this controls the max number of concurrently running AMs in the cluster. Jian On Wed,

Re: Uncompressed size of Sequence files

2013-11-27 Thread Robert Dyer
I should probably mention that my attempt to use the 'hadoop' command for this task fails (the file is fairly large, about 80GB compressed): $ HADOOP_HEAPSIZE=3000 hadoop fs -text /path/to/file | wc -c Exception in thread "main" java.lang.OutOfMemoryError: Java heap space at java.lang.StringCoding$Str

Capacity Scheduler Issue

2013-11-27 Thread Munna
Hi Folks, For the last two days I have been trying to configure the Capacity Scheduler; here is how I have been struggling. I am using Hadoop 2.0.0 and YARN 2.0.0 (CDH4). Initially I created 4 queues as per the Capacity Scheduler documentation, and those queues are shown in the RM UI. After configuration I tr

Re: Re: problems of FairScheduler in hadoop2.2.0

2013-11-27 Thread Sandy Ryza
Thanks for the additional info. Still not sure what could be going on. Do you notice any other suspicious LOG messages in the resourcemanager log? Are you able to show the results of /ws/v1/cluster/scheduler? On the resourcemanager web UI, how much memory does it say is used? On Wed, Nov 27,

ISSUE with Filter Hbase Table using SingleColumnValueFilter

2013-11-27 Thread samir das mohapatra
Dear developer, I am looking for a solution where I can apply the *SingleColumnValueFilter* to select only rows whose value matches the one I mention in the value parameter, and nothing other than the value I pass. *Example:* SingleColumnValueFilter colValFilter = new SingleColumnValueFilter(Bytes

Get counters of retired jobs via API

2013-11-27 Thread Felix . 徐
Hi all, I can see a lot of metrics for retired jobs on the jobtracker's webpage; however, it seems that I cannot get anything about those retired jobs through the Java API (JobClient). Is there any way to do that? Thanks!

Region Server based filter using SingleColumnValueFilter is not working in CDH4.2.1 but working on CDH4.1.2

2013-11-27 Thread samir das mohapatra
Dear Hadoop/HBase Developer, I was trying to scan an HBase table applying a *SingleColumnValueFilter*. It works fine in CDH4.1.2, but the same code does not work in another dev cluster running CDH4.2.1. Is there an issue with the version difference, or is it a code-level i

Re: Suddenly NameNode stopped responding

2013-11-27 Thread Bharath Vissapragada
It is difficult to guess the reason behind this outage without the logs. Can we have a look at them? (pastebin). Did you configure HA for the NameNode? Did it fail over to standby? On Wed, Nov 27, 2013 at 10:29 AM, Sandeep L wrote: > Hi, > Couple of hours back all of a sudden the NameNode of our production

Re: problems of FairScheduler in hadoop2.2.0

2013-11-27 Thread 麦树荣
Hi, sorry, let me add some information. Hadoop 2.2.0 had been running normally for some days since I started up the Hadoop server, and I could run jobs without any problems. Today the jobs suddenly cannot run, and all the jobs' statuses stay “submitted” after submitting. There are 3 slaves

RE: Any reference for upgrade hadoop from 1.x to 2.2

2013-11-27 Thread Nirmal Kumar
Hello Sandy, The post was useful and gave an insight into the migration. I am doing a test migration from Apache Hadoop-1.2.0 to Apache Hadoop-2.0.6-alpha in a single-node environment. I have Apache Hadoop-1.2.0 up and running. Can you please let me know the steps one should follow

Re: problems of FairScheduler in hadoop2.2.0

2013-11-27 Thread Sandy Ryza
Hi, Can you share the contents of your fair-scheduler.xml? If you submit just a single job, does it run? What do you see if you go to /ws/v1/cluster/scheduler? -Sandy On Wed, Nov 27, 2013 at 12:09 AM, 麦树荣 wrote: > Hi, all > > > > When I run jobs in hadoop 2.2.0, I encounter a problem. Sud

problems of FairScheduler in hadoop2.2.0

2013-11-27 Thread 麦树荣
Hi all, When I run jobs in Hadoop 2.2.0, I encounter a problem. Suddenly the Hadoop resourcemanager cannot work normally: when I submit jobs, the jobs’ statuses are all “submitted” and they cannot run. I cannot find any answers on the internet; who can give me some help? Thanks. The resourcemanag