[
https://issues.apache.org/jira/browse/HADOOP-13397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16017427#comment-16017427
]
Elek, Marton commented on HADOOP-13397:
---------------------------------------
I am interested in the Docker images, as I plan to create additional
getting-started tutorials based on them.
I tested mkhdf and it worked well. I also have experience with my own Docker
images: I am running Hadoop/Spark/HBase and other clusters with Docker images
where every service is in a separate container (see
http://github.com/elek/bigdata-docker/ if you are interested).
I suggest splitting this JIRA, since (as I see it) there are two parts:
1. One part is the role of mkhdf, which could create a self-contained,
customized Dockerfile according to the parameters.
2. I think it is a separate task to create (or generate with mkhdf) one exact
Dockerfile, commit it to a new branch in the Hadoop git repository, and ask
INFRA to register the new branch with Docker Hub.
My proposal for the second part is here:
https://github.com/elek/hadoop/tree/docker-2.8.0
An example of how to use it is here:
https://github.com/elek/hadoop/blob/docker-2.8.0/example/docker-compose.yaml
As you can see, everything can be configured with environment variables, thanks
to a simple script which converts the environment variables to Hadoop XML (and
other property) formats.
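To illustrate the idea, here is a minimal sketch of such a conversion in
Python. It is not the script from the branch above; the function name
env_to_hadoop_xml and the naming convention (a per-file prefix such as
"CORE-SITE.XML_" followed by the property name) are assumptions made only for
this example.

```python
import os
import xml.etree.ElementTree as ET


def env_to_hadoop_xml(env, prefix):
    """Build a Hadoop-style <configuration> document from a mapping.

    Only keys starting with `prefix` are used; the rest of the key is
    taken as the property name. The prefix convention is hypothetical.
    """
    root = ET.Element("configuration")
    for key in sorted(env):
        if not key.startswith(prefix):
            continue  # unrelated variable such as PATH
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = key[len(prefix):]
        ET.SubElement(prop, "value").text = env[key]
    # encoding="unicode" returns a str without an XML declaration
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Print the core-site.xml content derived from the current environment.
    print(env_to_hadoop_xml(os.environ, "CORE-SITE.XML_"))
```

With a variable like CORE-SITE.XML_fs.defaultFS=hdfs://namenode:9000 set in the
compose file, the sketch would emit one property element named fs.defaultFS.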
I would be happy to contribute this type of configuration loading to mkhdf
as a separate module. But as I wrote, I think these are two different things,
and by creating two separate JIRAs, I think we can create apache/hadoop images
without blocking on the mkhdf script.
> Add dockerfile for Hadoop
> -------------------------
>
> Key: HADOOP-13397
> URL: https://issues.apache.org/jira/browse/HADOOP-13397
> Project: Hadoop Common
> Issue Type: Bug
> Reporter: Klaus Ma
> Assignee: Allen Wittenauer
> Attachments: HADOOP-13397.DNC001.patch
>
>
> For now, there's no community version of a Dockerfile in Hadoop; most Docker
> images are provided by vendors, e.g.
> 1. Cloudera's image: https://hub.docker.com/r/cloudera/quickstart/
> 2. From HortonWorks sequenceiq:
> https://hub.docker.com/r/sequenceiq/hadoop-docker/
> 3. MapR provides the mapr-sandbox-base:
> https://hub.docker.com/r/maprtech/mapr-sandbox-base/
> The proposal of this JIRA is to provide a community version of a Dockerfile in
> Hadoop, and here are some requirements:
> 1. Separate Docker images for master & agents, e.g. resource manager & node
> manager
> 2. Default configuration to start master & agent instead of configuring them
> manually
> 3. Start the Hadoop process as a non-daemon
> Here's my dockerfile to start master/agent:
> https://github.com/k82cn/outrider/tree/master/kubernetes/imgs/yarn
> I'd like to contribute it after polishing :).
> Email Thread :
> http://mail-archives.apache.org/mod_mbox/hadoop-user/201607.mbox/%3CSG2PR04MB162977CFE150444FA022510FB6370%40SG2PR04MB1629.apcprd04.prod.outlook.com%3E