[ 
https://issues.apache.org/jira/browse/MAHOUT-2074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Palumbo updated MAHOUT-2074:
-----------------------------------
    Description: 
Have a [WIP] Dockerfile which (assuming a binary release) pulls the 
appropriate version of Spark and places Spark and Mahout in {{/opt/spark}} 
and {{/opt/mahout}} respectively.
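
A minimal sketch of what such a Dockerfile could look like (illustrative only: the base image, version numbers, download URLs, and archive names below are assumptions, not the committed values in the [WIP] file):

```dockerfile
# Illustrative sketch only -- versions, URLs, and archive names are
# assumptions, not the values in the actual [WIP] Dockerfile.
FROM openjdk:8-jre-slim

ARG SPARK_VERSION=2.4.5
ARG MAHOUT_VERSION=14.1

ENV SPARK_HOME=/opt/spark \
    MAHOUT_HOME=/opt/mahout

# Pull the matching binary releases and unpack them under /opt.
RUN apt-get update && apt-get install -y --no-install-recommends curl \
 && curl -fsSL "https://archive.apache.org/dist/spark/spark-${SPARK_VERSION}/spark-${SPARK_VERSION}-bin-hadoop2.7.tgz" \
      | tar -xz -C /opt \
 && mv "/opt/spark-${SPARK_VERSION}-bin-hadoop2.7" "$SPARK_HOME" \
 && curl -fsSL "https://archive.apache.org/dist/mahout/${MAHOUT_VERSION}/apache-mahout-distribution-${MAHOUT_VERSION}.tar.gz" \
      | tar -xz -C /opt \
 && mv "/opt/apache-mahout-distribution-${MAHOUT_VERSION}" "$MAHOUT_HOME" \
 && apt-get purge -y curl && rm -rf /var/lib/apt/lists/*

COPY entrypoint.sh /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh"]
```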

Would like to add full Mahout build capabilities (this should not be 
difficult) in a second file.

These files currently use an {{ENTRYPOINT ["entrypoint.sh"]}} instruction and 
some environment variables (none uncommon to Spark or Mahout aside from a 
{{$MAHOUT_CLASSPATH}} {{env}} variable).

The {{entrypoint.sh}} essentially checks whether the command is for a 
worker or a driver, and runs as such.  Currently I'm just dumping the entire 
{{$MAHOUT_HOME/lib/*.jar}} set into {{$MAHOUT_CLASSPATH}} and adding it to the 
{{SPARK_CLASSPATH}}.

If {{entrypoint.sh}} detects a driver, it will launch 
{{spark-submit}}.  IIRC (which I do not think that I do), {{spark-submit}} can 
handle any driver.  [~pferrel] Does this sound correct?  Otherwise we just add 
the Mahout driver class to be passed to {{spark-submit}} as a command parameter.
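
A minimal sketch of an {{entrypoint.sh}} along these lines (illustrative only: the {{worker}}/{{driver}} argument convention, the helper name, and {{SPARK_MASTER_URL}} are assumptions here, not the actual [WIP] script):

```shell
#!/usr/bin/env bash
# Illustrative sketch, not the actual [WIP] entrypoint.sh.
# Assumes SPARK_HOME/MAHOUT_HOME are set by the Dockerfile, and that
# SPARK_MASTER_URL and the {worker|driver} argument are assumptions.
set -euo pipefail

# Gather every jar in the given lib dir into a colon-separated classpath.
build_mahout_classpath() {
  local lib_dir="$1" cp="" jar
  for jar in "$lib_dir"/*.jar; do
    [ -e "$jar" ] || continue          # glob matched nothing
    cp="${cp:+$cp:}$jar"
  done
  printf '%s' "$cp"
}

main() {
  # Dump $MAHOUT_HOME/lib/*.jar into MAHOUT_CLASSPATH and append it
  # to SPARK_CLASSPATH, as the description above does.
  export MAHOUT_CLASSPATH
  MAHOUT_CLASSPATH="$(build_mahout_classpath "${MAHOUT_HOME}/lib")"
  export SPARK_CLASSPATH="${SPARK_CLASSPATH:+${SPARK_CLASSPATH}:}${MAHOUT_CLASSPATH}"

  case "${1:-}" in
    worker)
      # Join the cluster as a Spark worker (Spark 2.x script name).
      exec "${SPARK_HOME}/sbin/start-slave.sh" "${SPARK_MASTER_URL}"
      ;;
    driver)
      shift
      # Hand the Mahout driver class and its arguments to spark-submit.
      exec "${SPARK_HOME}/bin/spark-submit" --master "${SPARK_MASTER_URL}" "$@"
      ;;
    *)
      echo "usage: entrypoint.sh {worker|driver} [spark-submit args...]" >&2
      exit 1
      ;;
  esac
}

# Dispatch only when arguments are given, so the helper can be tested alone.
if [ "$#" -gt 0 ]; then main "$@"; fi
```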

Though it may be better to migrate in 14.2 or 15.0 to an entirely new 
build chain, e.g. CMake (which I would suggest, given our large amount of 
native code, hopefully soon to be added :)).

Though it's nearly finished, we may want to punt this {{Dockerfile}} for 14.1; 
or marking it Experimental is likely the better option.

  was:
Have a [WIP] Dockerfile which (assuming a binary release) pulls the 
appropriate version of Spark and places Spark and Mahout in 
{{/opt/spark}} and {{/opt/mahout}} respectively.

Would like full build capabilities, though this may be better to migrate in 
14.2 or 15.0 to an entirely new build chain, e.g. CMake (which I would 
suggest, given our large amount of native code, hopefully soon to be added :)).


> Dockerfile(s) 
> --------------
>
>                 Key: MAHOUT-2074
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-2074
>             Project: Mahout
>          Issue Type: New Feature
>    Affects Versions: 14.1
>            Reporter: Andrew Palumbo
>            Assignee: Andrew Palumbo
>            Priority: Critical
>             Fix For: 14.1
>
>
> Have a [WIP] Dockerfile which (assuming a binary release) pulls the 
> appropriate version of Spark and places Spark and Mahout in 
> {{/opt/spark}} and {{/opt/mahout}} respectively.
> Would like to add full Mahout build capabilities (this should not be 
> difficult) in a second file.
> These files currently use an {{ENTRYPOINT ["entrypoint.sh"]}} instruction 
> and some environment variables (none uncommon to Spark or Mahout aside from 
> a {{$MAHOUT_CLASSPATH}} {{env}} variable).
> The {{entrypoint.sh}} essentially checks whether the command is for a 
> worker or a driver, and runs as such.  Currently I'm just dumping the entire 
> {{$MAHOUT_HOME/lib/*.jar}} set into {{$MAHOUT_CLASSPATH}} and adding it to 
> the {{SPARK_CLASSPATH}}.
> If {{entrypoint.sh}} detects a driver, it will launch 
> {{spark-submit}}.  IIRC (which I do not think that I do), {{spark-submit}} 
> can handle any driver.  [~pferrel] Does this sound correct?  Otherwise we 
> just add the Mahout driver class to be passed to {{spark-submit}} as a 
> command parameter.
> Though it may be better to migrate in 14.2 or 15.0 to an entirely new 
> build chain, e.g. CMake (which I would suggest, given our large amount of 
> native code, hopefully soon to be added :)).
> Though it's nearly finished, we may want to punt this {{Dockerfile}} for 
> 14.1; or marking it Experimental is likely the better option.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
