[
https://issues.apache.org/jira/browse/MAHOUT-2074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990175#comment-16990175
]
Andrew Palumbo commented on MAHOUT-2074:
hey [@Joe Olson|https://mahout.slack.com/team/UNU4UU745] -- I started messing
around with breaking the Dockerfiles apart last night. They're far from complete,
but you might want to use these:
[andrewpalumbo/mahout: Dockerfile.spark|https://github.com/andrewpalumbo/mahout/blob/dockerfiles/Dockerfile.spark]
[andrewpalumbo/mahout: Dockerfile|https://github.com/andrewpalumbo/mahout/blob/dockerfiles/Dockerfile]
Again, I really haven't done much work on them; I just started a basic file, then
ripped Spark out of one.
thx
> Dockerfile(s)
> --
>
> Key: MAHOUT-2074
> URL: https://issues.apache.org/jira/browse/MAHOUT-2074
> Project: Mahout
> Issue Type: New Feature
>Affects Versions: 14.1
>Reporter: Andrew Palumbo
>Assignee: Joe Olson
>Priority: Critical
> Fix For: 14.1
>
>
> Have a [WIP] Dockerfile which (assuming a binary release) pulls the
> appropriate version of Spark and places Spark and Mahout in
> {{/opt/spark}} and {{/opt/mahout}}, respectively.
> Would like to add full Mahout build capabilities (this should not be
> difficult) in a second file.
> These files currently use an {{ENTRYPOINT ["entrypoint.sh"]}} instruction and some
> environment variables (none uncommon to Spark or Mahout, aside from a
> {{$MAHOUT_CLASSPATH}} {{env}} variable).
> The {{entrypoint.sh}} essentially checks whether the command is from a
> worker or a driver, and runs as such. Currently I'm just dumping the entire
> {{$MAHOUT_HOME/lib/*.jar}} into the {{$MAHOUT_CLASSPATH}} and adding it to
> the {{SPARK_CLASSPATH}}.
> If the {{entrypoint.sh}} file detects a driver, it will launch
> {{spark-submit}}. IIRC (and I do not think that I do), {{spark-submit}} can
> handle any driver. [~pferrel], does this sound correct? Otherwise we just add
> the Mahout class to be passed to {{spark-submit}} as a command parameter.
> Though it may be better to migrate to an entirely new build chain in 14.2 or
> 15.0, e.g. CMake (which I would suggest), given our large amount of native
> code (hopefully soon to be added :)).
> Though it's nearly finished, we may want to punt this {{Dockerfile}} for 14.1;
> marking it Experimental is likely a better option.
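For reference, the entrypoint behaviour described in the issue above could be sketched roughly as follows. This is not the actual {{entrypoint.sh}} from the branch. The function names, the {{driver}}/{{worker}} role arguments, the {{DRY_RUN}} switch, and the use of {{spark-class}} to start a standalone worker are all assumptions made for illustration. Only the classpath handling and the {{spark-submit}} launch for a driver come from the description.

```shell
#!/bin/sh
# Sketch only: the role names and the DRY_RUN switch are assumptions made for
# illustration, not taken from the actual entrypoint.sh.

# Print every jar under <mahout_home>/lib as one colon-separated classpath.
build_mahout_classpath() {
  classpath=""
  for jar in "$1"/lib/*.jar; do
    [ -e "$jar" ] || continue          # skip the literal pattern when no jar matches
    classpath="${classpath:+$classpath:}$jar"
  done
  printf '%s\n' "$classpath"
}

# Dispatch on the first argument: "driver" goes to spark-submit, "worker"
# starts a standalone Spark worker. Remaining arguments are passed through.
run_role() {
  role="${1:-}"
  [ "$#" -gt 0 ] && shift
  spark_home="${SPARK_HOME:-/opt/spark}"
  case "$role" in
    driver) set -- "$spark_home/bin/spark-submit" "$@" ;;
    worker) set -- "$spark_home/bin/spark-class" org.apache.spark.deploy.worker.Worker "$@" ;;
    *)
      echo "unknown role '$role' (expected driver or worker)" >&2
      return 64
      ;;
  esac
  if [ -n "${DRY_RUN:-}" ]; then
    printf '%s\n' "$*"                 # show the command instead of running it
  else
    exec "$@"
  fi
}

# A real entrypoint would then do something like:
#   MAHOUT_CLASSPATH="$(build_mahout_classpath "${MAHOUT_HOME:-/opt/mahout}")"
#   export SPARK_CLASSPATH="${SPARK_CLASSPATH:+$SPARK_CLASSPATH:}$MAHOUT_CLASSPATH"
#   run_role "$@"
```

Setting {{DRY_RUN=1}} prints the resolved command instead of running it, which makes it easy to check the dispatch without a Spark install. Passing the remaining arguments straight through means the Mahout driver class can be supplied as an ordinary command parameter, as suggested in the description.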
--
This message was sent by Atlassian Jira
(v8.3.4#803005)