[ 
https://issues.apache.org/jira/browse/HADOOP-16091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16833011#comment-16833011
 ] 

Elek, Marton commented on HADOOP-16091:
---------------------------------------

Thank you very much for uploading this patch [~eyang]. It's always easier to 
discuss real code.

1. First of all, please move this to an HDDS jira. It seems that the scope of 
the issue has shifted to changing HDDS, and I think it's better to handle it 
under the HDDS project.

2. I am +1 on using the assembly plugin instead of the shell-based tar.
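
For reference, the tar creation could be wired into the build roughly like 
this (a sketch only; the execution id, phase and descriptor path are my 
assumptions, not taken from the patch):

{code:xml}
<!-- pom.xml sketch: replace the shell-based tar with the assembly plugin -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-assembly-plugin</artifactId>
  <executions>
    <execution>
      <id>dist-tar</id>
      <phase>package</phase>
      <goals><goal>single</goal></goals>
      <configuration>
        <!-- descriptor that lists the content of the final ozone folder -->
        <descriptors>
          <descriptor>src/main/assemblies/ozone-dist.xml</descriptor>
        </descriptors>
        <formats><format>tar.gz</format></formats>
        <tarLongFileMode>posix</tarLongFileMode>
      </configuration>
    </execution>
  </executions>
</plugin>
{code}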

3. I have some concerns about the inline docker image creation.

# The discussion started on the mailing list (and I don't think it's 
finished; my latest concerns are written here: 
https://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201903.mbox/%3C5bfeb864-3f26-1ccc-3300-2680e1b94f34%40apache.org%3E)
# It's also under discussion in HDDS-1458, and I don't think that we have 
consensus yet.
 
I would prefer to agree on the approach first and discuss it in one location, 
to make it easier for everybody to follow.

4. It seems that we have exactly the same functionality in the 
hadoop-ozone/dist k8s-dev and k8s-dev-push profiles. Do you have any 
suggestion for how the duplication can be handled?

AFAIK this approach works as follows (please correct me if I am wrong):

 * We create a final ozone folder
 * We create a tar file (~360MB)
 * We copy it to the local maven repository (~360MB) 
 * We copy it to the docker/target directory (~360MB)
 * We create the docker image (~360MB)

In the k8s-dev profile:

 * We create a final ozone folder
 * We create the docker image (~360MB)

But the results are the same. The k8s-dev flow seems more efficient to me, and 
the tar step can be made optional: for a normal build we don't need it at all, 
only for the kubernetes deployment.
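
To summarize the two bullet lists above side by side (a pseudocode sketch; 
the image name and profile invocation are illustrative, the sizes are the 
~360MB from above):

{code}
# approach in the current patch (sketch):
mvn package                                 # final ozone folder
#  -> ozone-X.tar.gz                        (~360MB)
#  -> copy to the local maven repository    (~360MB)
#  -> copy to docker/target                 (~360MB)
docker build -t apache/ozone docker/target  # image layer (~360MB)

# k8s-dev profile (sketch):
mvn package -Pk8s-dev                       # final ozone folder
#  -> docker build directly on the dist folder (~360MB)
{code}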

bq. /opt/ozone-${project.version}

Did you execute the smoketests? Did you test it from hive/spark? I am afraid 
that we will have problems when testing with ozonefs/spark/hive, as we need to 
know the exact location of the jar file. I think it's better to keep the 
location version independent, but it's not a strong preference.
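
If we keep the versioned directory, one way to make the location version 
independent is a symlink next to it (a minimal sketch, setting aside the 
squashfs limitation mentioned below; the /tmp prefix and version number are 
only for illustration):

{code}
# create the versioned directory and a stable, version-independent path
VERSION=0.4.0
mkdir -p /tmp/opt/ozone-$VERSION
ln -sfn "ozone-$VERSION" /tmp/opt/ozone
# clients can now reference /tmp/opt/ozone regardless of the version
readlink /tmp/opt/ozone
{code}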

bq. Another notable problem is the hadoop-runner image is built with Squash and 
symlinks are not supported, and move of directory location is also not 
supported during build process. It is probably better to pick centos as base 
image to avoid those limitations with squashfs based image.

Can you please give me more information? I don't understand what exactly the 
problem is. Where do we need symlinks?
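
If the limitation is about creating symlinks inside the image, a minimal 
reproduction could help the discussion (a sketch; the link source and target 
paths are illustrative, not from the patch):

{code}
# Dockerfile sketch to check symlink support in the hadoop-runner base image
FROM apache/hadoop-runner
RUN ln -s /opt/hadoop /opt/ozone && ls -l /opt/ozone
{code}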

Independent of the answer, I have no problem with using centos. I believe that 
we use centos even now:

{code}
docker run apache/hadoop-runner cat /etc/redhat-release
CentOS Linux release 7.6.1810 (Core) 
{code}


> Create hadoop/ozone docker images with inline build process
> -----------------------------------------------------------
>
>                 Key: HADOOP-16091
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16091
>             Project: Hadoop Common
>          Issue Type: New Feature
>            Reporter: Elek, Marton
>            Assignee: Eric Yang
>            Priority: Major
>         Attachments: HADOOP-16091.001.patch, HADOOP-16091.002.patch
>
>
> This is proposed by [~eyang] in 
> [this|https://lists.apache.org/thread.html/33ac54bdeacb4beb023ebd452464603aaffa095bd104cb43c22f484e@%3Chdfs-dev.hadoop.apache.org%3E]
>  mailing thread.
> {quote}1, 3. There are 38 Apache projects hosting docker images on Docker hub 
> using Apache Organization. By browsing Apache github mirror. There are only 7 
> projects using a separate repository for docker image build. Popular projects 
> official images are not from Apache organization, such as zookeeper, tomcat, 
> httpd. We may not disrupt what other Apache projects are doing, but it looks 
> like inline build process is widely employed by majority of projects such as 
> Nifi, Brooklyn, thrift, karaf, syncope and others. The situation seems a bit 
> chaotic for Apache as a whole. However, Hadoop community can decide what is 
> best for Hadoop. My preference is to remove ozone from source tree naming, if 
> Ozone is intended to be subproject of Hadoop for long period of time. This 
> enables Hadoop community to host docker images for various subproject without 
> having to check out several source tree to trigger a grand build. However, 
> inline build process seems more popular than separated process. Hence, I 
> highly recommend making docker build inline if possible.
> {quote}
> The main challenges are also discussed in the thread:
> {quote}
> 3. Technically it would be possible to add the Dockerfile to the source
> tree and publish the docker image together with the release by the
> release manager but it's also problematic:
> {quote}
> a) there is no easy way to stage the images for the vote
>  c) it couldn't be flagged as automated on dockerhub
>  d) It couldn't support the critical updates.
>  * Updating existing images (for example in case of an ssl bug, rebuild
>  all the existing images with exactly the same payload but updated base
>  image/os environment)
>  * Creating image for older releases (We would like to provide images,
>  for hadoop 2.6/2.7/2.7/2.8/2.9. Especially for doing automatic testing
>  with different versions).
> The a) can be solved (as [~eyang] suggested) with using a personal docker 
> image during the vote and publish it to the dockerhub after the vote (in case 
> the permission can be set by the INFRA)
> Note: based on LEGAL-270 and linked discussion both approaches (inline build 
> process / external build process) are compatible with the apache release.
> Note: HDDS-851 and HADOOP-14898 contains more information about these 
> problems.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
