A custom implementation would have to be developed using a container
orchestration service such as Kubernetes: create a cluster of Pods (container
sets) with different daemons running in different Pods, and scale the Pods. For
example, start the ResourceManager on one instance and the NodeManager on the others.
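A sketch of that layout as two Kubernetes Deployments, one per daemon type, might look like this (the image name, labels, and commands below are hypothetical placeholders, not official artifacts):

```yaml
# Hypothetical sketch: one ResourceManager, NodeManagers scaled via replicas.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: yarn-resourcemanager
spec:
  replicas: 1                      # a single ResourceManager instance
  selector:
    matchLabels: {app: yarn-rm}
  template:
    metadata:
      labels: {app: yarn-rm}
    spec:
      containers:
      - name: resourcemanager
        image: example/hadoop:2.7.2          # placeholder image
        command: ["yarn", "resourcemanager"]
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: yarn-nodemanager
spec:
  replicas: 3                      # scale NodeManagers by changing replicas
  selector:
    matchLabels: {app: yarn-nm}
  template:
    metadata:
      labels: {app: yarn-nm}
    spec:
      containers:
      - name: nodemanager
        image: example/hadoop:2.7.2          # placeholder image
        command: ["yarn", "nodemanager"]
```

Scaling the NodeManager Pods is then just a matter of editing `replicas` (or running `kubectl scale`).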
cloudera/quickstart is a Docker image for Hadoop:
https://hub.docker.com/r/cloudera/quickstart/
Also refer to:
http://www.cloudera.com/documentation/enterprise/5-6-x/topics/quickstart_docker_container.html
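If memory serves, the Cloudera guide linked above starts the container roughly like this (flags recalled from that guide; verify against the current page before relying on them):

```shell
# Pull and run the single-node quickstart container (per Cloudera's guide).
docker pull cloudera/quickstart:latest
docker run --hostname=quickstart.cloudera --privileged=true -t -i \
  cloudera/quickstart /usr/bin/docker-quickstart
```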
I'd like to deploy YARN on Kubernetes.
I built Docker images with Apache Hadoop, and I'd like to contribute them into
the Hadoop source if none exist yet. It would be great if Hadoop had an official
place for those images.
Da (Klaus), Ma (??), PMP®| Software Architect
Platform DCOS Development &
On Mon, Jul 18, 2016 at 5:34 PM, Klaus Ma wrote:
> Hi team,
>
>
> Does anyone know where the official Docker images are? If not, I'd like to
> contribute a Dockerfile for them.
I am just curious, what's your use case?
Also, you may want to look at the following "prior art" in
Hi team,
Does anyone know where the official Docker images are? If not, I'd like to
contribute a Dockerfile for them.
BTW, do we have an official Docker Hub account for Hadoop?
If you have any suggestions, please let me know.
Da (Klaus), Ma (??), PMP®| Software Architect
Platform DCOS Development &
Welcome to the community Richard!
I suspect Hadoop can be more useful than just splitting data and stitching it
back together. Depending on your use case, it may come in handy to manage your
machines, restart failed tasks, schedule work when data becomes available,
etc. I wouldn't necessarily count it out.
Hi Vinodh,
Are there any spaces in your JAVA_HOME path? If so, you need to use the short
(8.3) path, e.g., C:\progra~1\java (assuming you haven't done so already).
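For example, on the Windows command line (the exact short name varies per machine; `dir /x` shows it, and the JDK folder name below is just a placeholder):

```shell
:: Show the 8.3 short names for entries under C:\ (e.g. PROGRA~1)
dir /x C:\

:: Point JAVA_HOME at the space-free short path (adjust to your JDK folder)
set "JAVA_HOME=C:\progra~1\Java\jdk1.8.0_92"
```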
From: Rakesh Radhakrishnan
Date: Sunday, July 17, 2016 at 11:03 PM
To: Vinodh Nagaraj
Hello Michael,
Historically, there has never been a firm requirement that clients must call
FileSystem#close upon finishing usage of an instance. I think the history here
is that the close method was not part of the initial API definition, and when
it was added, there were already a lot of
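Since FileSystem does implement java.io.Closeable, clients who do want deterministic cleanup can lean on try-with-resources, which guarantees close() is called. A minimal stdlib-only sketch of that idiom, using a hypothetical stand-in class rather than a real FileSystem:

```java
import java.io.Closeable;

public class CloseDemo {
    // Stand-in for a FileSystem-like resource; hypothetical, illustration only.
    static class FakeFileSystem implements Closeable {
        boolean closed = false;
        @Override public void close() { closed = true; }
    }

    public static void main(String[] args) {
        FakeFileSystem captured;
        // try-with-resources calls close() automatically, even on exceptions.
        try (FakeFileSystem fs = new FakeFileSystem()) {
            captured = fs;
        }
        System.out.println(captured.closed);  // prints "true"
    }
}
```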
I think you're confused as to what these things are.
The fundamental question is: do you want to run one job on sub-parts of the
data, then stitch their results together (in which case
Hive/MapReduce/Spark will be for you), or do you essentially already have
splitting to computer-sized chunks
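The split-then-stitch pattern described above can be illustrated with a tiny word-count-style sketch in plain Java (no Hadoop involved; this only mirrors the map/reduce shape):

```java
import java.util.*;

public class SplitStitch {
    // "Map": count words in one chunk of the data, independently of the rest.
    static Map<String, Integer> countChunk(String chunk) {
        Map<String, Integer> counts = new HashMap<>();
        for (String w : chunk.split("\\s+")) {
            if (!w.isEmpty()) counts.merge(w, 1, Integer::sum);
        }
        return counts;
    }

    // "Reduce": stitch the per-chunk results back into one total.
    static Map<String, Integer> stitch(List<Map<String, Integer>> parts) {
        Map<String, Integer> total = new HashMap<>();
        for (Map<String, Integer> part : parts) {
            part.forEach((w, n) -> total.merge(w, n, Integer::sum));
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> chunks = Arrays.asList("a b a", "b c", "a c c");
        List<Map<String, Integer>> partials = new ArrayList<>();
        for (String chunk : chunks) partials.add(countChunk(chunk));  // per-chunk jobs
        Map<String, Integer> result = stitch(partials);
        System.out.println(result.get("a") + " " + result.get("b") + " " + result.get("c"));
        // prints "3 2 3"
    }
}
```

In a real deployment each `countChunk` call would run on a different machine, which is exactly the bookkeeping MapReduce/Spark take off your hands.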
Hello,
I wonder if the community can help me get started.
I’m trying to design the architecture of a project and I think that using some
Apache Hadoop technologies may make sense, but I am completely new to
distributed systems and to Apache (I am a very experienced developer, but my
Sandeep,
Can you please share more information on which Hadoop version you are using,
and also the size of the cluster in terms of fsimage size or file/block count?
Also, what is the threshold set for RPC latency?
It is very unlikely that the standby NN sees RPC latency unless
there is a
>>> I couldn't find folder *conf* in *hadoop home*.
Could you check the %HADOOP_HOME%/etc/hadoop/hadoop-env.cmd path? Maybe the
U:/Desktop/hadoop-2.7.2/etc/hadoop/hadoop-env.cmd location.
Typically, HADOOP_CONF_DIR will be set to %HADOOP_HOME%/etc/hadoop. Could
you check the "HADOOP_CONF_DIR" env variable
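On the Windows command line, those checks amount to something like this:

```shell
:: See where HADOOP_CONF_DIR currently points (echoes the literal name if unset)
echo %HADOOP_CONF_DIR%

:: Typical setting, mirroring the default layout
set "HADOOP_CONF_DIR=%HADOOP_HOME%\etc\hadoop"

:: Confirm hadoop-env.cmd exists under the config dir
dir "%HADOOP_CONF_DIR%\hadoop-env.cmd"
```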