Re: Hadoop graphing tools

2013-10-14 Thread Swaroop Patra
Hi, you can use Tableau. Link - http://tableausoftware.com/ But you need Hive. On Tue, Oct 15, 2013 at 10:30 AM, Jagat Singh wrote: > Hi, > > You should look at Pentaho or Talend tools. > > Connect via HiveServer and plot charts. > > Thanks > On 15/10/2013 2:04 PM, "Xuri Nagarin" wrote: > >> …

Re: Hadoop graphing tools

2013-10-14 Thread Jagat Singh
Hi, You should look at Pentaho or Talend tools. Connect via HiveServer and plot charts. Thanks. On 15/10/2013 2:04 PM, "Xuri Nagarin" wrote: > Hi, > > I am looking for some simple graphing tools to use with Hadoop (bar or > line chart). Most Google searches for "hadoop graphing" turn up results …
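
As an illustration of the "connect via HiveServer and plot charts" suggestion, here is a minimal JDBC sketch, assuming HiveServer2 on its default port 10000; the host, database, table and column names are placeholders, not anything from this thread:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    // Minimal sketch: pull aggregated rows out of Hive over JDBC so a charting
    // tool (or a small script) can plot them. Host, table and column names are
    // placeholders.
    public class HiveChartFeed {
      public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver"); // HiveServer2 driver
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:hive2://hive-host:10000/default", "user", "");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(
                 "SELECT event_type, COUNT(*) FROM events GROUP BY event_type")) {
          while (rs.next()) {
            // Emit CSV that any bar/line/pie charting tool can consume.
            System.out.println(rs.getString(1) + "," + rs.getLong(2));
          }
        }
      }
    }

The CSV this prints can then be loaded into Tableau, Excel or any other simple charting tool.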

Re: high availability

2013-10-14 Thread Jing Zhao
"it is unclear to me if the transition in this case is also rapid but the fencing takes long while the new namenode is already active, or if in this period i am stuck without an active namenode." The standby->active transition will get stuck in this period, i.e., the NN can only become active afte

Re: Improving MR job disk IO

2013-10-14 Thread Xuri Nagarin
Yep, I have several tens of terabytes of data that will easily be over a couple of hundred TB in a year. It isn't as if I have only one or two use cases to run on these data sets. I need to run everything from simple aggregations like counting and averaging to more advanced analytics. I also need to be able to search through …

Re: Hadoop graphing tools

2013-10-14 Thread Xuri Nagarin
Not really performance monitoring, but a simple charting tool: scan a data set, extract keys/values and place them on a 2-D chart, the same way you would load a small data set into an Excel spreadsheet and create a bar/line/pie chart from it. On Mon, Oct 14, 2013 at 8:12 PM, Lance Norskog wrote: > You mean a performance monitoring tool? …

Re: Improving MR job disk IO

2013-10-14 Thread Lance Norskog
There are a few reasons to use map/reduce, or just map-only or reduce-only jobs. 1) You want to do parallel algorithms where data from multiple machines has to be cross-checked; Map/Reduce allows this. 2) You want to run several instances of a job; Hadoop does this reliably by monitoring all instances …
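
To make point 1 concrete, here is a minimal sketch of the kind of cross-machine aggregation Map/Reduce gives you; the class names and the tab-separated input assumption are only for the example:

    import java.io.IOException;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;

    // Mappers run on many machines and each emit (key, 1) for the records they
    // see locally; the shuffle brings every value for a key to one reducer,
    // which is where the cross-machine combination (here, a global count) happens.
    public class KeyCount {

      public static class CountMapper
          extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text outKey = new Text();

        @Override
        protected void map(LongWritable offset, Text line, Context context)
            throws IOException, InterruptedException {
          // Assume the first tab-separated field is the key we want to count.
          String[] fields = line.toString().split("\t", 2);
          outKey.set(fields[0]);
          context.write(outKey, ONE);
        }
      }

      public static class SumReducer
          extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> counts, Context context)
            throws IOException, InterruptedException {
          int sum = 0;
          for (IntWritable c : counts) {
            sum += c.get();
          }
          context.write(key, new IntWritable(sum));
        }
      }
    }

Each mapper only sees its local input splits; the shuffle is what brings all the pieces together so a single reducer can produce the global answer for each key.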

Re: Hadoop graphing tools

2013-10-14 Thread Lance Norskog
You mean a performance monitoring tool? I have not used any, but you should search for that, not for "graphing". On 10/14/2013 08:03 PM, Xuri Nagarin wrote: Hi, I am looking for some simple graphing tools to use with Hadoop (bar or line chart). Most Google searches for "hadoop graphing" turn up results …

Hadoop graphing tools

2013-10-14 Thread Xuri Nagarin
Hi, I am looking for some simple graphing tools to use with Hadoop (bar or line chart). Most Google searches for "hadoop graphing" turn up results for much more complex graph analysis tools like Giraph. Any simple rrdtool-like solutions for Hadoop? TIA, Xuri

Re: Improving MR job disk IO

2013-10-14 Thread Xuri Nagarin
Yes, I tested with smaller data sets and the MR job correctly reads/matches one line at a time. On Fri, Oct 11, 2013 at 4:48 AM, DSuiter RDX wrote: > So, perhaps this has been thought of, but perhaps not. > > It is my understanding that grep is usually sorting things one line at a > time. As …
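
For reference, matching one line at a time grep-style is naturally a map-only job; a minimal sketch, where the "grep.pattern" property name is just an assumption for the example:

    import java.io.IOException;
    import java.util.regex.Pattern;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // Map-only "grep": each mapper reads its input split one line at a time and
    // emits only the lines matching a regex.
    public class GrepMapper extends Mapper<LongWritable, Text, Text, NullWritable> {
      private Pattern pattern;

      @Override
      protected void setup(Context context) {
        // "grep.pattern" is an example property name, e.g. passed in via -D.
        pattern = Pattern.compile(context.getConfiguration().get("grep.pattern", ".*"));
      }

      @Override
      protected void map(LongWritable offset, Text line, Context context)
          throws IOException, InterruptedException {
        if (pattern.matcher(line.toString()).find()) {
          context.write(line, NullWritable.get());
        }
      }
    }

Configuring the job with job.setNumReduceTasks(0) writes matched lines straight to the output files and skips the sort/shuffle, which is usually the heaviest disk IO in a full Map/Reduce pass.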

Re: Hadoop Jobtracker cluster summary of heap size and OOME

2013-10-14 Thread Rajesh Balamohan
JT OOM? Refer to https://issues.apache.org/jira/browse/MAPREDUCE-5508 (but a JT OOM should happen over a period of time, not immediately). On Tue, Oct 15, 2013 at 7:44 AM, Viswanathan J wrote: > Hi Arun, > > Will not cross-post hereafter. > > I had the same heap size value, same no. of jobs, scheduler …

Re: Hadoop Jobtracker cluster summary of heap size and OOME

2013-10-14 Thread Viswanathan J
Hi Arun, I will not cross-post hereafter. I had the same heap size value and the same number of jobs and the same scheduler, and it worked fine on Hadoop 1.0.4 for 8 to 9 months, but I'm facing this JT OOME issue only on Hadoop 1.2.1. Even though I tried setting the heap size to a max of 16G, it eats the whole …

Re: Hadoop Jobtracker heap size calculation and OOME

2013-10-14 Thread Viswanathan J
Hi, Not yet updated in the production environment. Will keep you posted once it is done. In which Apache Hadoop release will this issue be fixed? Or is this issue already fixed in hadoop-1.2.1, as in the link below: https://issues.apache.org/jira/i#browse/MAPREDUCE-5351?issueKey=MAPREDUCE-

Re: Job initialization failed: java.lang.NullPointerException at resolveAndAddToTopology

2013-10-14 Thread Arun C Murthy
Please ask on the CDH lists. Arun On Oct 11, 2013, at 4:59 AM, fab wol wrote: > Hey everyone, I've been supplied with a decent ten-node CDH 4.4 cluster, only > 7 days old, and someone tried some HBase stuff on it. Now I wanted to try > some MR stuff on it, but starting a job is already not possible …

Re: Hadoop Jobtracker cluster summary of heap size and OOME

2013-10-14 Thread Arun C Murthy
Please don't cross-post. A HADOOP_HEAPSIZE of 1024 is too low. You might want to bump it up to 16G or more, depending on: * the number of jobs * the scheduler you use. Arun On Oct 11, 2013, at 9:58 AM, Viswanathan J wrote: > Hi, > > I'm running a 14-node Hadoop cluster with TaskTrackers running on all nodes.

Re: Identification of mapper slots

2013-10-14 Thread Rahul Jain
I assume you know the tradeoff here: if you depend upon the mapper slot # in your implementation to speed it up, you are losing code portability in the long term. That said, one way to achieve this is to use the JobConf API: int partition = jobConf.getInt(JobContext.TASK_PARTITION, -1); The framework …
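
Expanding on that line, a minimal sketch of reading the task partition from inside a mapper with the new API; note this gives the task index within the job, not a physical slot, and "mapred.task.partition" is assumed here as the Hadoop 1.x property name:

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class PartitionAwareMapper extends Mapper<LongWritable, Text, Text, Text> {
      private int partition;

      @Override
      protected void setup(Context context) {
        // Task index within the job (not a physical "slot"); Hadoop 1.x key.
        partition = context.getConfiguration().getInt("mapred.task.partition", -1);
        // The task attempt ID encodes the same index, e.g.
        // attempt_201310141200_0042_m_000007_0 is map task 7.
        System.err.println("Running as " + context.getTaskAttemptID()
            + ", partition " + partition);
      }

      @Override
      protected void map(LongWritable offset, Text line, Context context)
          throws IOException, InterruptedException {
        // ... use 'partition' to pick per-task resources ...
      }
    }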

Identification of mapper slots

2013-10-14 Thread Hider, Sandy
In Hadoop, under mapred-site.xml, I can set the maximum number of mappers. For the sake of this email I will call the number of concurrent mappers "mapper slots". Is it possible to figure out, from within the mapper, which mapper slot it is running in? On this project this is important because …

Re: Yarn killing my Application Master

2013-10-14 Thread Pradeep Gollakota
Thank you so much Jian. While refactoring my code, I accidentally deleted the line that registered my app with the RM. Not sure how I missed it... doh! On Sat, Oct 12, 2013 at 9:14 PM, Jian He wrote: > By looking at the logs, it shows your application was killed and did not > successfully register …
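
For anyone hitting the same thing, a minimal sketch of that registration call, assuming the AMRMClient API and placeholder host, port and tracking-URL values:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.yarn.api.records.FinalApplicationStatus;
    import org.apache.hadoop.yarn.client.api.AMRMClient;
    import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
    import org.apache.hadoop.yarn.conf.YarnConfiguration;

    public class MinimalAppMaster {
      public static void main(String[] args) throws Exception {
        Configuration conf = new YarnConfiguration();
        AMRMClient<ContainerRequest> rmClient = AMRMClient.createAMRMClient();
        rmClient.init(conf);
        rmClient.start();

        // Without this call the RM never sees the AM as registered and will
        // kill the attempt once the liveness/expiry interval passes.
        // Host, port and tracking URL are placeholders.
        rmClient.registerApplicationMaster("am-host.example.com", 0, "");

        // ... request containers and run the actual work here ...

        // Tell the RM we are done before exiting.
        rmClient.unregisterApplicationMaster(FinalApplicationStatus.SUCCEEDED, "done", "");
        rmClient.stop();
      }
    }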

Re: Hadoop Jobtracker heap size calculation and OOME

2013-10-14 Thread Viswanathan J
Thanks a lot, Antonio. I'm using Apache Hadoop, and I hope this issue will be resolved in upcoming Apache Hadoop releases. Do I need to restart the whole cluster after changing the mapred-site conf as you mentioned? What is the following bug id: https://issues.apache.org/jira/i#browse/MAPREDU

Re: Hadoop Jobtracker heap size calculation and OOME

2013-10-14 Thread Viswanathan J
Hi guys, I appreciate your response. Thanks, Viswa.J On Oct 12, 2013 11:29 PM, "Viswanathan J" wrote: > Hi Guys, > > But I can see the JobTracker OOME issue listed as fixed in Hadoop 1.2.1 as > per the Hadoop release notes below. > > Please check this URL: > > https://issues.apache.org/jira/br

Re: Job initialization failed: java.lang.NullPointerException at resolveAndAddToTopology

2013-10-14 Thread fab wol
This looks like it belongs to my problem, right? https://issues.apache.org/jira/browse/MAPREDUCE-50 Cheers, Wolli 2013/10/11 DSuiter RDX > It looks like you are correct, and I did not have the right solution, I > apologize. I'm not sure if the other nodes need to be involved either. Now > I'm …

Re: Error putting files in the HDFS

2013-10-14 Thread hadoop hive
Hi Indrashish, Can you please check whether your DN is accessible by the NN, and also whether the NN's IP is given in the DN's hdfs-site.xml? Because if the DN is up and running, the issue is that the DN is not able to attach to the NN to get registered. You can add the DN to the include file as well. Thanks, Vikas Sriv…
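
A quick way to sanity-check this from the DataNode host is to open a client connection against the same NameNode address the DN's config points at; this is only a sketch, assuming the Hadoop 1.x fs.default.name key and a placeholder NameNode host/port:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;

    // Run this on the DataNode host to confirm it can reach the NameNode RPC
    // address that its configuration points at. Host and port are placeholders.
    public class CheckNameNodeReachable {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Hadoop 1.x key; must match the NameNode the DN should register with.
        conf.set("fs.default.name", "hdfs://namenode-host:8020");
        FileSystem fs = FileSystem.get(conf);
        System.out.println("Connected to: " + fs.getUri());
        System.out.println("Default block size: " + fs.getDefaultBlockSize());
      }
    }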