Re: Service Level Authorization

2014-02-21 Thread Juan Carlos
Thanks Alex, my path to the queue was a mistake I made while testing configurations, when I was unable to make ACLs work. My main problem was with the mapreduce.cluster.administrators parameter. I didn't know anything about this parameter, and I have been looking for it in
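For reference, mapreduce.cluster.administrators takes an ACL in the usual Hadoop form: a comma-separated user list, then a single space, then a comma-separated group list. A minimal mapred-site.xml sketch (the user and group names are placeholders, not values from this thread):

```
<!-- mapred-site.xml: users/groups allowed to administer the MR cluster.
     Format is "user1,user2 group1,group2"; one space separates the
     user list from the group list. -->
<property>
  <name>mapreduce.cluster.administrators</name>
  <value>hadoopadmin mapredadmins</value>
</property>
```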

Datanodes going out of reach in hadoop

2014-02-21 Thread Yogini Gulkotwar
Hello, I am working with a 5-node Hadoop cluster. HDFS is on a shared 98 TB NFS directory. When we view the NameNode UI, the following columns are displayed: Node, Last Contact, Admin State, Configured Capacity (TB), Used (TB), Non DFS Used (TB), Remaining (TB), Used (%), Remaining (%)

JobHistoryEventHandler failed with AvroTypeException.

2014-02-21 Thread Rohith Sharma K S
Hi all, I am using Hadoop 2.3 for a YARN cluster. While running a job, I encountered the exception below in the MRAppMaster. Why is this error being logged? 2014-02-21 22:10:33,841 INFO [Thread-355] org.apache.hadoop.service.AbstractService: Service JobHistoryEventHandler failed in state STOPPED; cause:

Path filters for multiple inputs

2014-02-21 Thread AnilKumar B
Hi, how can I apply path filters for multiple inputs? That is, I need to apply a different filter to each of the multiple inputs. Is that possible? I tried setting my own PathFilter on FileInputFormat, but it is applied to all of the inputs. Also, how can I ignore the sub-directories in
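One workaround, since FileInputFormat.setInputPathFilter() registers a single filter for the whole job: make that one filter dispatch on the input directory. The directory names and extensions below are hypothetical; the dispatch logic is written as a plain function for clarity — in a real job it would sit inside a PathFilter's accept(Path), called with path.toString().

```java
// A single filter body that applies a different rule per input
// directory, because FileInputFormat.setInputPathFilter() is global.
// Directory names and extensions are hypothetical examples.
public class PerInputFilter {
    static boolean accept(String path) {
        if (path.contains("/input-a/")) {
            return path.endsWith(".avro"); // rule for the first input
        }
        if (path.contains("/input-b/")) {
            return path.endsWith(".txt");  // rule for the second input
        }
        // Accept anything else, including the input directories
        // themselves, which the filter also sees during listing.
        return true;
    }
}
```

The same dispatch can also reject sub-directories of an input (return false for them) so they are skipped during listing.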

Cleanup after Yarn Job

2014-02-21 Thread Brian C. Huffman
All, I'm trying to model a YARN client after the Distributed Shell example. However, I'd like to add a method to clean up the job's files after completion. I've defined a cleanup routine: private void cleanup(ApplicationId appId, FileSystem fs) throws IOException { String
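A sketch of such a cleanup, assuming the client staged its files under a per-application HDFS directory — the /apps/&lt;appId&gt; layout here is an assumption for illustration, not the Distributed Shell example's actual layout:

```java
import java.io.IOException;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.yarn.api.records.ApplicationId;

public class YarnClientCleanup {
    // Recursively delete the staging directory this client created
    // for the application. The path layout is hypothetical.
    static void cleanup(ApplicationId appId, FileSystem fs)
            throws IOException {
        Path appDir = new Path("/apps/" + appId.toString());
        if (fs.exists(appDir)) {
            fs.delete(appDir, true); // true = recursive
        }
    }
}
```

Calling this after waiting for the application report to show a terminal state (FINISHED, FAILED, or KILLED) avoids deleting files the AM still needs.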

[no subject]

2014-02-21 Thread Aaron Zimmerman
The worker nodes on my version 2.2 cluster won't use more than 11 of the 30 total (24 allocated) for MapReduce jobs running in YARN. Does anyone have an idea what might be constraining the RAM usage? I followed the steps listed here:
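Memory visible to YARN is governed by a handful of properties rather than the machine's physical RAM, so a mismatch among them is one common culprit. A sketch of the ones usually involved (the values below are illustrative, not tuned for this cluster):

```
<!-- yarn-site.xml on each worker: memory the NodeManager offers to
     containers. Defaults to 8192 MB, which silently caps usage. -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>24576</value>
</property>

<!-- mapred-site.xml: per-container sizes actually requested by MR
     tasks; small values here also limit how much RAM jobs consume. -->
<property>
  <name>mapreduce.map.memory.mb</name>
  <value>2048</value>
</property>
<property>
  <name>mapreduce.reduce.memory.mb</name>
  <value>4096</value>
</property>
```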

Re: Capacity Scheduler capacity vs. maximum-capacity

2014-02-21 Thread ricky l
Does the Hadoop Capacity Scheduler support preemption in this scenario? Based on what Vinod says, preemption seems to be supported via configuration. If so, can someone point me to instructions for doing that? Preemption would really be helpful for my use case. Thanks. On Fri, Feb 21, 2014 at 12:39

Questions from a newbie to Hadoop

2014-02-21 Thread Publius
Hello, I am new to Hadoop and trying to learn it. I want to set up pseudo-distributed mode on my Windows Vista (32-bit) machine to experiment with. I am having great difficulty locating the correct software to do this. JDK 1.6 or JDK 1.7? Eclipse? Oracle VirtualBox, VMware Player, etc.? CDH4,

Re: Questions from a newbie to Hadoop

2014-02-21 Thread Arpit Agarwal
You can try building Apache Hadoop with these instructions: https://wiki.apache.org/hadoop/Hadoop2OnWindows 32-bit Windows has not been tested.

Re: Questions from a newbie to Hadoop

2014-02-21 Thread Devin Suiter RDX
You should also clarify for the group: do you want to make a virtual machine to run a pseudo-distributed Hadoop cluster on, or do you want to install Hadoop directly onto the Vista machine and run it there? If the former, you should be able to set up a VM just fine with a Linux version of your

Re: No job shown in Hadoop resource manager web UI when running jobs in the cluster

2014-02-21 Thread Zhijie Shen
Hi Richard, not sure how the NPE happened on your command line, but I'd like to clarify something here: 1. If you want to see MapReduce jobs, please use mapred job; hadoop job is deprecated. If you want to see all kinds of applications run by your YARN cluster, please use yarn application. 2. Job

Re: Questions from a newbie to Hadoop

2014-02-21 Thread Publius
I wish to run a pseudo-distributed machine in a VM. I have it almost running on Oracle VirtualBox with the Hortonworks Sandbox 2.0, but the virtual appliance wants a 64-bit CPU, and mine is only 32-bit. Still looking for a 32-bit version of the Hortonworks Sandbox 2.0. Hortonworks seems to be very

Re: A question about Hadoop 1 job user id used for group mapping, which could lead to performance degradation

2014-02-21 Thread Chris Schneider
Hi John, FWIW, setting the log level of org.apache.hadoop.security.UserGroupInformation to ERROR seemed to prevent the fatal NameNode slowdown we ran into. Although I still saw "no such user" Shell$ExitCodeException messages in the logs, these only occurred every few minutes or so. Thus, it
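The log-level change Chris describes would go in log4j.properties on the NameNode, along these lines:

```
# Suppress per-lookup group-mapping warnings from UserGroupInformation;
# only ERROR and above from this logger will be written.
log4j.logger.org.apache.hadoop.security.UserGroupInformation=ERROR
```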

Having trouble adding external JAR to MapReduce Program

2014-02-21 Thread Jonathan Poon
Hi everyone, I'm running into trouble adding the Avro JAR to my MapReduce program. I do the following to try to add the Avro JAR: export

RE: Having trouble adding external JAR to MapReduce Program

2014-02-21 Thread Gaurav Gupta
Jonathan, you have to make sure that the JAR is available on the nodes where the MapReduce job is running. Setting HADOOP_CLASSPATH on a single node doesn't work. You can pass -libjars to the hadoop command line. Thanks, Gaurav From: Jonathan Poon [mailto:jkp...@ucdavis.edu]
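Note that -libjars is only honored when the driver parses generic options, e.g. by running through ToolRunner. A minimal driver sketch (the class name, jar name, and job setup are placeholders):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

// ToolRunner parses generic options such as -libjars before run() is
// invoked, so the listed jars are shipped to every task's classpath.
public class MyDriver extends Configured implements Tool {
    @Override
    public int run(String[] args) throws Exception {
        Job job = Job.getInstance(getConf(), "example");
        // ... set mapper/reducer/input/output paths here ...
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        // e.g.: hadoop jar myjob.jar MyDriver -libjars avro.jar <in> <out>
        System.exit(ToolRunner.run(new Configuration(), new MyDriver(), args));
    }
}
```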

Re: Having trouble adding external JAR to MapReduce Program

2014-02-21 Thread Azuryy Yu
Hi, you cannot add a JAR this way. Please look at DistributedCache in the Hadoop Java docs, and call DistributedCache.addCacheArchive() in your main class before submitting the MR job. On Sat, Feb 22, 2014 at 9:30 AM, Gaurav Gupta gau...@datatorrent.com wrote: Jonathan, You have to make sure
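In the Hadoop 2 mapreduce API the distributed-cache calls also live directly on Job. A sketch of shipping a jar before submission — the HDFS path and jar name are placeholders, and the jar must already exist on HDFS:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;

public class ShipJar {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "avro-example");

        // Ship an extra jar to every task and put it on the task
        // classpath. The path below is a placeholder.
        job.addFileToClassPath(new Path("/libs/avro.jar"));

        // ... configure mapper/reducer/paths, then:
        // job.waitForCompletion(true);
    }
}
```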

Re: Having trouble adding external JAR to MapReduce Program

2014-02-21 Thread Jonathan Poon
Thanks for the suggestions! I will give them a look and see if they will work! Jonathan On Fri, Feb 21, 2014 at 6:35 PM, Azuryy Yu azury...@gmail.com wrote: Hi, you cannot add jar like this way. please look at DistributeCache in the Hadoop Java Doc. please call