Hadoop 2.6.0 Multi Node Setup

2015-01-27 Thread Telles Nobrega
Hi, I'm starting to deploy a Hadoop 2.6.0 multi-node cluster. My first question is: the documentation page says that the configuration files are under conf/, but I found them in etc/. Should I move them to conf/, or is this just out-of-date information? My second question is regarding user permissions, I

Re: Hadoop 2.6.0 Multi Node Setup

2015-01-27 Thread Ahmed Ossama
Hi Telles, No, the documentation isn't out of date. Normally Hadoop configuration files are placed under /etc/hadoop/conf and are then referenced when starting the cluster with --config $HADOOP_CONF_DIR; this is how hdfs and yarn find their configuration. Second, it's not a good practice to
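
A minimal Python sketch of the --config convention described in this reply. The helper names resolve_conf_dir and daemon_command are hypothetical, the fallback to etc/hadoop under the install directory is an assumption based on the tarball layout Telles reports, and the script name, install path, and "start namenode" arguments are only illustrative. The sketch builds the command line and does not run anything.

```python
import os


def resolve_conf_dir(env, install_dir):
    """Return the configuration directory a daemon would be started with.

    Prefers HADOOP_CONF_DIR when it is set; otherwise assumes the tarball
    layout <install_dir>/etc/hadoop (an assumption, not a documented rule).
    """
    conf_dir = env.get("HADOOP_CONF_DIR")
    if conf_dir:
        return conf_dir
    return os.path.join(install_dir, "etc", "hadoop")


def daemon_command(env, install_dir, daemon):
    """Build an argv list that passes the config directory via --config."""
    # The script location is illustrative; adjust it to the actual install.
    script = os.path.join(install_dir, "sbin", "hadoop-daemon.sh")
    conf_dir = resolve_conf_dir(env, install_dir)
    return [script, "--config", conf_dir, "start", daemon]


if __name__ == "__main__":
    # Print the command instead of executing it, so the sketch is safe to run.
    # "/opt/hadoop-2.6.0" is a hypothetical install path.
    print(" ".join(daemon_command(dict(os.environ), "/opt/hadoop-2.6.0", "namenode")))
```

With HADOOP_CONF_DIR=/etc/hadoop/conf exported, the built command points at /etc/hadoop/conf; with the variable unset, it falls back to the etc/hadoop directory inside the install, so moving the files to conf/ would not be needed under this assumption.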

Re: Hadoop 2.6.0 Multi Node Setup

2015-01-27 Thread Telles Nobrega
Thanks. On Tue Jan 27 2015 at 15:59:35 Ahmed Ossama ah...@aossama.com wrote: Hi Telles, No, the documentation isn't out of date. Normally Hadoop configuration files are placed under /etc/hadoop/conf and are then referenced when starting the cluster with --config $HADOOP_CONF_DIR; this is

Re: MapReduce job is not picking up appended data.

2015-01-27 Thread Azuryy Yu
Are you sure you can 'cat' the latest batch of the data on HDFS? For Flume, the data is available only after the file is rolled, because Flume only calls FileSystem.close() during file rolling. On Mon, Jan 26, 2015 at 8:17 PM, Uthayan Suthakar uthayan.sutha...@gmail.com wrote: I have a Flume which

Re: Which [open-source] SQL engine atop Hadoop?

2015-01-27 Thread Daniel Haviv
Can you elaborate on why you prefer Tajo? Daniel On 27 Jan 2015, at 10:35, Azuryy Yu azury...@gmail.com wrote: You have listed almost all of the open-source MPP real-time SQL-on-Hadoop engines. I prefer Tajo, which released 0.9.0 recently and is still a work in progress toward 1.0. On Mon, Jan 26,

Re: Which [open-source] SQL engine atop Hadoop?

2015-01-27 Thread Azuryy Yu
You have listed almost all of the open-source MPP real-time SQL-on-Hadoop engines. I prefer Tajo, which released 0.9.0 recently and is still a work in progress toward 1.0. On Mon, Jan 26, 2015 at 10:19 PM, Samuel Marks samuelma...@gmail.com wrote: Since Hadoop https://hive.apache.org came out, there have been

Re: Hadoop 2.6.0 Multi Node Setup

2015-01-27 Thread Ahmed Ossama
Make sure that all nodes can resolve each other. You can do this by simply adding the IPs and host names of the cluster nodes to /etc/hosts on each node. Then add the nodes to your /etc/hadoop/slaves file. On 01/27/2015 10:58 PM, Telles Nobrega wrote: I was able to start some services, but Yarn is failing with
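
The check suggested in this reply can be sketched in Python as follows: given the text of /etc/hosts and of the slaves file, list the slave host names that have no /etc/hosts entry. The function names are hypothetical, the sample addresses come from the 192.0.2.0/24 documentation range, and only telles-hadoop-two appears in the thread; the other host names are made up for illustration. The sketch compares names only, so slaves listed by IP address would be reported as missing.

```python
def parse_hosts(hosts_text):
    """Collect every host name and alias from /etc/hosts-style text."""
    names = set()
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        fields = line.split()
        if len(fields) >= 2:  # first field is the IP address
            names.update(fields[1:])
    return names


def parse_slaves(slaves_text):
    """Return the host names listed one per line in a slaves file."""
    names = []
    for line in slaves_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if line:
            names.append(line)
    return names


def missing_from_hosts(hosts_text, slaves_text):
    """List slaves (in file order) that /etc/hosts does not name."""
    known = parse_hosts(hosts_text)
    return [host for host in parse_slaves(slaves_text) if host not in known]


if __name__ == "__main__":
    # Hypothetical sample data; on a real node, read the two files instead.
    hosts = "192.0.2.11 telles-hadoop-one\n192.0.2.12 telles-hadoop-two\n"
    slaves = "telles-hadoop-one\ntelles-hadoop-two\ntelles-hadoop-three\n"
    print(missing_from_hosts(hosts, slaves))
```

Running the check on every node is the point: an entry missing on a single machine would be enough for that machine to fail to resolve a peer.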

Re: NN questions

2015-01-27 Thread Ahmed Ossama
Writes are done in parallel. Yes, dfs.namenode.shared.edits.dir supports multiple directories. On 01/26/2015 10:16 PM, dlmar...@comcast.net wrote: If multiple directories are specified for dfs.namenode.name.dir and dfs.namenode.edits.dir, are the writes to the different directories done in

Re: Hadoop 2.6.0 Multi Node Setup

2015-01-27 Thread Telles Nobrega
I was able to start some services, but Yarn is failing with org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.io.IOException: Failed on local exception: java.net.SocketException: Unresolved address; Host Details : local host is: telles-hadoop-two; destination host is: (unknown):0. Just

Re: Reliability of timestamps in logs

2015-01-27 Thread Ravi Prakash
I'm afraid I don't know what the SLS is. Obviously it shouldn't matter if it runs on the same node. I don't think Hadoop code ever updates the system clock. In fact, it shouldn't even be run with the permissions to do so. It depends on the log4j appenders whether they buffer and batch the messages before

After cluster rolling reboot, nodemanager could not authenticate to resource manager

2015-01-27 Thread Manoj Samel
Environment is Hadoop 2.3.0, CDH 5.0, RM and NN in HA, Kerberos security. A rolling reboot of the cluster was done. Services on each node were not stopped beforehand; the machines were just shut down and rebooted, and services were started on each after reboot. Nodes were shut down in a rolling manner such that one

Re: yarn jobhistory server not displaying all jobs

2015-01-27 Thread Matt K
Thanks Ravi! This helps. On Mon, Jan 26, 2015 at 2:22 PM, Ravi Prakash ravi...@ymail.com wrote: Hi Matt! Take a look at the mapreduce.jobhistory.* configuration parameters here for the delay in moving finished jobs to the HistoryServer:

yarn cache settings

2015-01-27 Thread hitarth trivedi
Hi, We have yarn.nodemanager.local-dirs set to /var/lib/hadoop/tmp/nm-local-dir. This is the directory where the mapreduce jobs store temporary data. On restart of nodemanager, the contents of the directory are deleted. I see the following definitions for

Re: Hadoop Security Community

2015-01-27 Thread Ashish Kumar9
I would also be interested. From: Chris MacKenzie stu...@chrismackenziephotography.co.uk To: user@hadoop.apache.org Date: 01/27/2015 03:46 PM Subject: Re: Hadoop Security Community Hi Adam, I would also love to be involved in this. Regards, Chris MacKenzie telephone: 0131

Re: Reliability of timestamps in logs

2015-01-27 Thread Fabio
Thanks for the diagrams tip! I will try it. (SLS is the Scheduler Load Simulator http://hadoop.apache.org/docs/r2.6.0/hadoop-sls/SchedulerLoadSimulator.html ) Regards Fabio On 01/27/2015 11:04 PM, Ravi Prakash wrote: I'm afraid I don't know what the SLS is. Obviously it shouldn't matter if

Re: Question about YARN Memory allocation

2015-01-27 Thread 임정택
Forgot to add one thing: all memory (120G) is reserved now. Apps Submitted | Apps Pending | Apps Running | Apps Completed | Containers Running | Memory Used | Memory Total | Memory Reserved | VCores Used | VCores Total | VCores Reserved | Active Nodes | Decommissioned Nodes | Lost Nodes | Unhealthy Nodes | Rebooted Nodes: 211060120 GB120 GB20

Question about YARN Memory allocation

2015-01-27 Thread 임정택
Hello all! I'm new to YARN, so this could be a beginner question. (I've been using MRv1 and switched just now.) I'm using HBase with 3 masters and 10 slaves - CDH 5.2 (Hadoop 2.5.0). To migrate from MRv1 to YARN, I read several docs and changed the configuration.

cannot create files in hdfs when -put command issued on a datanode which is in exclude list

2015-01-27 Thread Rainer Toebbicke
Hello, I ran into a weird problem creating files, and for the moment I only have a shaky conclusion: logged in as a vanilla user on a datanode, the simple command hdfs dfs -put /etc/motd motd reproducibly bails out with WARN hdfs.DFSClient: DataStreamer Exception

Re: MapReduce job is not picking up appended data.

2015-01-27 Thread Uthayan Suthakar
Azuryy, I'm pretty sure that I could 'cat'. Please see below for the evidence: (1) Flume.conf: a1.sinks.k1.hdfs.rollInterval=3600 a1.sinks.k1.hdfs.batchSize = 10 I sent 21 events and I could 'cat' and verify this: $ hdfs dfs -cat

Re: Time until a datanode is marked as dead

2015-01-27 Thread Frank Lanitz
On 26.01.2015 at 10:46, Azuryy Yu wrote: can you file an issue to add this configuration to the hdfs-default.xml? Done with https://issues.apache.org/jira/browse/HDFS-7685 Cheers, Frank

Re: Hadoop Security Community

2015-01-27 Thread daemeon reiydelle
Add me to the list of interested parties. I am heavily involved with security and controls of Hadoop, Google MR, use of OSSEC on Hadoop clusters, etc. *...* *“Life should not be a journey to the grave with the intention of arriving safely in a pretty and well preserved body, but rather

Re: After cluster rolling reboot, nodemanager could not authenticate to resource manager

2015-01-27 Thread daemeon reiydelle
Check the IP addresses and host names of the RM (could there be an issue around which interface the nodes are now using?)

Re: yarn cache settings

2015-01-27 Thread daemeon reiydelle
If the /var/lib/hadoop/tmp directory lives on the / file system, you may want to reconsider that. Disk I/O there will cause issues for the OS as it attempts to use its own file system.