Elek, Marton created HDDS-829:
---------------------------------

             Summary: Support cloud native Ozone deployment
                 Key: HDDS-829
                 URL: https://issues.apache.org/jira/browse/HDDS-829
             Project: Hadoop Distributed Data Store
          Issue Type: Improvement
            Reporter: Elek, Marton
            Assignee: Elek, Marton


I tested Ozone on a Kubernetes cluster because:

 * It makes it easy to run tests that scale up and down (for example, testing a 
500-node cluster).
 * It makes it very fast to deploy/undeploy newer versions.
 * It makes it very easy to run the same cluster in multiple cloud environments.
 * Long term, it could also provide persistent storage for any container/pod in 
the k8s cluster.

Kubernetes is also very popular, so it would be useful to provide official 
guidance on how Ozone can be started on k8s.

As of now we use the base image from the flokkr project (charts: 
http://github.com/flokkr/charts, base image: 
http://github.com/flokkr/docker-hadoop).

In this issue I would like to collect the steps required to provide a simple 
way to start an Ozone cluster on Kubernetes.

What we need:

1. Official apache/ozone:0.2.1 and apache/ozone:0.3.0 docker images. (This is 
not a big task, as we already have the same for the apache/hadoop images. There 
is an existing pattern which can be followed.)

2. Create a helm chart. (I have an example helm chart. It is easier with the 
latest scm/om initialization improvement, where the 'om --init' call can be 
executed even if the files already exist. We can make the chart part of the 
official helm/stable repository sooner or later.)

3. The biggest difference between the flokkr and apache base images is that 
flokkr contains logic to instrument the java processes for prometheus.

This part downloads a jar file (based on https://github.com/elek/jmx_exporter) 
and uses it as a java agent. This is a fork of the original jmx_exporter which 
doesn't support dynamic jmx attachment.

I believe that we need a simple monitoring solution to show the key numbers of 
a default install, but this approach may be too complex to adopt in the apache 
base image.

As prometheus uses a very simple text format to publish the metrics, I propose 
to create a new simple servlet to publish all hadoop metrics. It should be 
similar to this one: 

https://github.com/prometheus/client_java/blob/master/simpleclient_common/src/main/java/io/prometheus/client/exporter/common/TextFormat.java

But based on Hadoop metrics.

This is a small, independent change which could be done without any additional 
dependencies.
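
To illustrate how little is needed for the text format, here is a minimal 
sketch in plain Java with no dependencies. It is not the proposed servlet: the 
class name, the record name "OmMetrics", the metric names and the choice to 
report every value as a gauge are all assumptions made for the example. A real 
implementation would read the values from the Hadoop metrics system and write 
the result to the servlet response.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper: renders name/value pairs in the Prometheus text format. */
public class PrometheusTextSketch {

  /** Converts a CamelCase record/metric name pair to a snake_case name. */
  public static String toPrometheusName(String recordName, String metricName) {
    String joined = recordName + "_" + metricName;
    // Put '_' between a lower-case letter or digit and the following capital.
    String snake = joined.replaceAll("([a-z0-9])([A-Z])", "$1_$2")
        .toLowerCase(Locale.ROOT);
    // Replace any remaining character that is not a letter, digit or '_'.
    return snake.replaceAll("[^a-z0-9_]", "_");
  }

  /** Renders each metric as a "# TYPE" line followed by a "name value" line. */
  public static String render(String recordName, Map<String, Number> metrics) {
    StringBuilder out = new StringBuilder();
    for (Map.Entry<String, Number> e : metrics.entrySet()) {
      String name = toPrometheusName(recordName, e.getKey());
      // Assumption: everything is exposed as a gauge for simplicity.
      out.append("# TYPE ").append(name).append(" gauge\n");
      out.append(name).append(' ').append(e.getValue()).append('\n');
    }
    return out.toString();
  }

  public static void main(String[] args) {
    // Made-up sample values, only to show the output shape.
    Map<String, Number> metrics = new LinkedHashMap<>();
    metrics.put("NumKeys", 42);
    metrics.put("NumVolumes", 3);
    System.out.print(render("OmMetrics", metrics));
  }
}
```

With the sample values above, the output contains lines such as 
"om_metrics_num_keys 42", which prometheus can scrape directly.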

4. Another minor feature of the flokkr base image is the WAITFOR environment 
variable, which can be used to wait for a dependent service. (It could be used 
as a workaround, for example, until HDDS-776 is implemented.) This can be 
handled with 8 lines of bash code.
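
The wait logic itself is just a retry loop around a TCP connect. The sketch 
below shows that loop in Java purely to illustrate the idea; the actual change 
would be the few lines of bash in the image's start script. The "host:port" 
format of the WAITFOR value, the class name and the timeout values are 
assumptions made for the example.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical illustration of waiting for a dependent service. */
public class WaitForSketch {

  /** Retries a TCP connect to "host:port" until it succeeds or time runs out. */
  public static boolean waitFor(String hostPort, long timeoutMillis)
      throws InterruptedException {
    int idx = hostPort.lastIndexOf(':');
    String host = hostPort.substring(0, idx);
    int port = Integer.parseInt(hostPort.substring(idx + 1));
    long deadline = System.currentTimeMillis() + timeoutMillis;
    while (System.currentTimeMillis() < deadline) {
      try (Socket socket = new Socket()) {
        // 1 second connect timeout per attempt (arbitrary choice).
        socket.connect(new InetSocketAddress(host, port), 1000);
        return true;
      } catch (IOException e) {
        // Service is not reachable yet: pause briefly and retry.
        Thread.sleep(500);
      }
    }
    return false;
  }

  public static void main(String[] args) throws InterruptedException {
    // Assumption: WAITFOR holds a single "host:port" value.
    String target = System.getenv("WAITFOR");
    if (target != null && !target.isEmpty() && !waitFor(target, 60_000)) {
      System.err.println("Timed out waiting for " + target);
      System.exit(1);
    }
  }
}
```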


After steps 1-4 we can add documentation on how Ozone can be started in any 
k8s cluster.




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

