Re: About Archival Storage

2016-07-19 Thread kevin
Thanks again. By "automatically" I mean: the hdfs mover knows when hot data has become cold, so I don't need to tell it exactly which files/dirs need to be moved now? Of course I should tell it which files/dirs need monitoring. 2016-07-20 12:35 GMT+08:00 Rakesh Radhakrishnan

Re: About Archival Storage

2016-07-19 Thread Rakesh Radhakrishnan
>>> I have another question: does the hdfs mover (A New Data Migration Tool) know when to move data from hot to cold automatically? While running, the tool reads its arguments to get the space-separated list of HDFS files/dirs to migrate. Then it periodically scans these files in HDFS to check if the
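
A minimal sketch of how the tool is invoked, assuming the storage policies have already been set on the paths (the paths below are placeholders):

    # Move blocks until they satisfy the storage policies set on these paths
    hdfs mover -p /data/cold /data/archive

    # Or read the list of paths to scan from a local file, one path per line
    hdfs mover -f /tmp/paths-to-migrate.txt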

Re: ZKFC fencing problem after the active node crash

2016-07-19 Thread Rakesh Radhakrishnan
Hi Alexandr, Since you powered off the active NN machine, during fail-over the SNN timed out connecting to that machine and fencing failed. Typically, fencing methods should be configured so that multiple writers to the same shared storage are not allowed. It looks like you are using 'QJM' and it supports the
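
As a rough sketch (standard HA property names; whether a fallback is appropriate depends on your shared-storage guarantees), the configured fencing methods can be inspected, and a shell fallback is often added so fail-over can still proceed when the old active host is powered off:

    # Show the fencing methods the ZKFC will try, in order
    hdfs getconf -confKey dfs.ha.fencing.methods

    # A common value in hdfs-site.xml is sshfence followed by a no-op fallback:
    #   sshfence
    #   shell(/bin/true)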

Re: About Archival Storage

2016-07-19 Thread kevin
Thanks a lot Rakesh. I have another question: does the hdfs mover (A New Data Migration Tool) know when to move data from hot to cold automatically? Does it use an algorithm like LRU or LFU? 2016-07-19 19:55 GMT+08:00 Rakesh Radhakrishnan : > Is that mean I should config

Re: Where's official Docker image for Hadoop?

2016-07-19 Thread Klaus Ma
Hi Deepak, This image still needs manual configuration, which does not meet the requirement. I'd suggest the Hadoop community provide a set of Dockerfiles as examples instead of relying on a vendor. And where's the Dockerfile in the source code? Here's the output of 2.7.2. Klauss-MacBook-Pro:hadoop-2.7.2-src

ZKFC fencing problem after the active node crash

2016-07-19 Thread Alexandr Porunov
Hello, I have configured a Hadoop HA cluster. It works as in the tutorials. If I kill the NameNode process with the command "kill -9 NameNodeProcessId", my standby node changes its state to active. But if I power off the active node, the standby node can't change its state to active because it tries to connect to

Re: ZKFC do not work in Hadoop HA

2016-07-19 Thread Rakesh Radhakrishnan
Good to hear the problem is resolved and you are able to continue. Regards, Rakesh On Tue, Jul 19, 2016 at 10:31 PM, Alexandr Porunov < alexandr.poru...@gmail.com> wrote: > Rakesh, > > Thank you very much. I missed it. I didn't have the "fuser" command on my nodes. > I've just installed it. ZKFC became work

Re: ZKFC do not work in Hadoop HA

2016-07-19 Thread Alexandr Porunov
Rakesh, Thank you very much. I missed it. I didn't have the "fuser" command on my nodes. I've just installed it. ZKFC now works properly! Best regards, Alexandr On Tue, Jul 19, 2016 at 5:29 PM, Rakesh Radhakrishnan wrote: > Hi Alexandr, > > I could see the following warning

Re: ZKFC do not work in Hadoop HA

2016-07-19 Thread Rakesh Radhakrishnan
Hi Alexandr, I can see the following warning message in your logs, and it is the reason for the unsuccessful fencing. Could you please check 'fuser' command execution on your system? 2016-07-19 14:43:23,705 WARN org.apache.hadoop.ha.SshFenceByTcpPort: PATH=$PATH:/sbin:/usr/sbin fuser -v -k -n tcp 8020
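
A quick way to verify this on each NameNode host is to dry-run the same check the sshfence method performs (a sketch; -k is omitted so nothing is killed, and psmisc is the package that usually provides fuser):

    which fuser || yum install -y psmisc    # or: apt-get install -y psmisc
    PATH=$PATH:/sbin:/usr/sbin fuser -v -n tcp 8020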

ZKFC do not work in Hadoop HA

2016-07-19 Thread Alexandr Porunov
Hello, I have a problem with ZKFC. I have configured High Availability for Hadoop with QJM. The problem is that when I turn off the active master node (or kill the NameNode process), the standby node does not change its status from standby to active. So it continues to be the standby node. I

Re: Where's official Docker image for Hadoop?

2016-07-19 Thread Deepak Vohra
Even the Hadoop documentation refers to the HortonWorks Docker image sequenceiq/hadoop-docker. https://hadoop.apache.org/docs/r2.7.2/hadoop-yarn/hadoop-yarn-site/DockerContainerExecutor.html Apache Hadoop develops the Hadoop software, not related technologies such as a Docker image. But a Docker

Re: Where's official Docker image for Hadoop?

2016-07-19 Thread Klaus Ma
I mean the community version; those Docker images are provided by vendors. —— Da (Klaus) Ma (马达), PMP® | Software Architect IBM Spectrum, STG, IBM GCG +86-10-8245 4084 | klaus1982...@gmail.com | http://k82.me On Jul 19, 2016, at 21:33, Deepak Vohra

Re: Where's official Docker image for Hadoop?

2016-07-19 Thread Deepak Vohra
What is meant by an official Hadoop image? Hadoop has several distributions, such as HortonWorks, Cloudera and MapR, and they do provide Docker images. 1. The official image from Cloudera is the quickstart image. https://hub.docker.com/r/cloudera/quickstart/ 2. From HortonWorks, sequenceiq

Re: Building a distributed system

2016-07-19 Thread Marcin Tustin
Perhaps there is. Note that there are a bunch of Java job queues. HDFS sounds like it might be a nice way to share the data. YARN or Mesos might be a nice way to schedule the running of the jobs, but it sounds like you could use any orchestration system to run the worker processes, and just have
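
For the data-sharing part, a minimal sketch of how a producer and a worker could exchange files through HDFS (the paths and the process binary are placeholders):

    # Producer publishes an input item to a shared directory
    hdfs dfs -mkdir -p /jobs/input /jobs/output
    hdfs dfs -put item-0001.dat /jobs/input/

    # Worker fetches it, processes it locally, and publishes the result
    hdfs dfs -get /jobs/input/item-0001.dat .
    ./process item-0001.dat result-0001.dat
    hdfs dfs -put result-0001.dat /jobs/output/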

Re: Building a distributed system

2016-07-19 Thread Richard Whitehead
Thanks Mirko, I simply don't understand enough to get going with this. The documentation dives into details very fast, but I don't really understand the basics of using Hadoop. I want to run a process that takes a file as input and gives a file as output; that seems pretty straightforward, but I
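
If the process can read records on stdin and write results on stdout, one way to run it on Hadoop without writing Java is the streaming jar; a rough sketch (jar path as laid out in a 2.7.x tarball, process.sh is a placeholder):

    hadoop jar $HADOOP_HOME/share/hadoop/tools/lib/hadoop-streaming-*.jar \
        -input  /jobs/input \
        -output /jobs/output \
        -mapper ./process.sh \
        -file   ./process.sh \
        -numReduceTasks 0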

Re: Where's official Docker image for Hadoop?

2016-07-19 Thread Klaus Ma
Hi Tsuyoshi, I have a set of Dockerfiles at https://github.com/k82cn/outrider/tree/master/kubernetes/imgs/yarn for Apache YARN/HDFS, and I'd like to contribute them upstream if possible. Would you also keep me in the loop if there is any discussion? Thanks, Klaus On Jul 19, 2016, at 15:32,

Re: About Archival Storage

2016-07-19 Thread Rakesh Radhakrishnan
Does that mean I should configure dfs.replication with 1? If it is more than one, should I not use the *Lazy_Persist* policy? The idea of the Lazy_Persist policy is that, while writing blocks, one replica is placed in memory first and then lazily persisted to DISK. It doesn't mean that you are
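
For reference, a sketch of how a policy is applied and checked per path (the paths are placeholders; replication stays at whatever dfs.replication says):

    # Apply the policy to a directory and verify it
    hdfs storagepolicies -setStoragePolicy -path /data/low-latency -policy LAZY_PERSIST
    hdfs storagepolicies -getStoragePolicy -path /data/low-latency

    # List every policy the cluster knows about
    hdfs storagepolicies -listPolicies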

About Archival Storage

2016-07-19 Thread kevin
I don't quite understand: "Note that the Lazy_Persist policy is useful only for single replica blocks. For blocks with more than one replicas, all the replicas will be written to DISK since writing only one of the replicas to RAM_DISK does not improve the overall performance." Does that mean I

Re: Building a distributed system

2016-07-19 Thread Richard Whitehead
Thanks Ravi and Marcin, You are right, what we need is a work queue, a way to start jobs on remote machines, and a way to move data to and from those remote machines. The “jobs” are just executables that process one item of data. We don’t need to split the data into chunks or to combine the
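
If YARN ends up being the scheduler, the bundled distributed-shell example application is roughly this shape: it just launches an arbitrary command in N containers. A sketch, with the jar path as laid out in a 2.7.x install and a placeholder executable:

    yarn org.apache.hadoop.yarn.applications.distributedshell.Client \
        -jar $HADOOP_HOME/share/hadoop/yarn/hadoop-yarn-applications-distributedshell-*.jar \
        -shell_command /opt/myapp/process-one-item.sh \
        -num_containers 4 \
        -container_memory 1024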

WebUI's Server don't work and JobHistoryServer missing

2016-07-19 Thread Mike Wenzel
My cluster looks like:
Node1 - NameNode + ResourceManager
Node2 - SecondaryNameNode
Node3 - DataNode (+NodeManager)
Node4 - DataNode (+NodeManager)
Node5 - DataNode (+NodeManager)
http://node1:8088/cluster works. My problems:
> SecondaryNameNode WebUI: http://node2:50090 doesn't work
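
For the missing JobHistoryServer: in a 2.x tarball install it has to be started separately; a sketch (the choice of node1 is just an assumption):

    # On whichever node should host it
    $HADOOP_HOME/sbin/mr-jobhistory-daemon.sh start historyserver

    # Its web UI listens on port 19888 by default:
    #   http://node1:19888/jobhistory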

Re: Where's official Docker image for Hadoop?

2016-07-19 Thread Tsuyoshi Ozawa
Hi Klaus, Thanks for telling us about the request. Currently, an official Docker image of Apache Hadoop is not available, as Roman mentioned. I will raise this request for discussion. Thanks, - Tsuyoshi On Tue, Jul 19, 2016 at 12:29 PM, Deepak Vohra wrote: > A custom