[
https://issues.apache.org/jira/browse/SUBMARINE-308?focusedWorklogId=368455&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-368455
]
ASF GitHub Bot logged work on SUBMARINE-308:
--------------------------------------------
Author: ASF GitHub Bot
Created on: 08/Jan/20 22:41
Start Date: 08/Jan/20 22:41
Worklog Time Spent: 10m
Work Description: pingsutw commented on pull request #145: SUBMARINE-308. Support PyTorch with TonY runtime
URL: https://github.com/apache/submarine/pull/145
### What is this PR for?
Support PyTorch with TonY runtime
### What type of PR is it?
[Bug Fix]
### Todos
* [ ] - Task
### What is the Jira issue?
https://issues.apache.org/jira/browse/SUBMARINE-308
### How should this be tested?
https://travis-ci.org/pingsutw/hadoop-submarine/builds/634462654
### Screenshots (if appropriate)

### Questions:
* Do the license files need an update? No
* Are there breaking changes for older versions? No
* Does this need documentation? No
Issue Time Tracking
-------------------
Worklog Id: (was: 368455)
Remaining Estimate: 0h
Time Spent: 10m
> Support PyTorch with TonY runtime
> ---------------------------------
>
> Key: SUBMARINE-308
> URL: https://issues.apache.org/jira/browse/SUBMARINE-308
> Project: Apache Submarine
> Issue Type: Sub-task
> Components: Backend Server, Doc
> Reporter: huiyangjian
> Priority: Major
> Labels: pull-request-available
> Attachments: localization-error.jpg, no-pytorch.jpg, yarn-mount-error.jpg
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> 1. We hope both stand-alone and distributed modes will be supported.
> 2. We hope a PyTorch example will be made available in the Submarine GitHub repository.
> 3. The following script, based on the PyTorch example in the documentation, was run without success.
> {code:java}
> CLASSPATH=`hadoop classpath --glob`:/home/bin/hadoop/share/hadoop/yarn/submarine-all-0.3.0-SNAPSHOT-hadoop-3.2.jar \
> java org.apache.submarine.client.cli.Cli \
> job run --name ${APP_NAME} \
> --env DOCKER_JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/jre/ \
> --env DOCKER_HADOOP_HDFS_HOME=/app/hadoop-3.2.1 \
> --env HADOOP_HOME=/hadoop-3.2.1 \
> --env HADOOP_YARN_HOME=/hadoop-3.2.1 \
> --env HADOOP_COMMON_HOME=/hadoop-3.2.1 \
> --env HADOOP_HDFS_HOME=/hadoop-3.2.1 \
> --env HADOOP_CONF_DIR=/hadoop-3.2.1/etc/hadoop \
> --env PYTHONUNBUFFERED="0" \
> --env TZ="Asia/Shanghai" \
> --env YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS=${LOCAL_PATH}/:/home/test \
> --queue dev \
> --input_path hdfs://cluster/user/work/tensorflow/data/ \
> --docker_image jx-bd-hadoop13.zeus.lianjia.com:801/runonce/tf-1.13.1-pytorch-0.4-gpu-base:0.0.1 \
> --num_workers 1 \
> --worker_resources memory=16G,vcores=2,gpu=1 \
> --worker_launch_cmd "export CLASSPATH=\$(/app/hadoop-3.2.1/bin/hadoop classpath --glob) && cd /home/test/pth && python ../pth/train_pth.py" \
> --localization /home/local/test/cifar10_estimator:./submarine_algorithm \
> --verbose \
> --conf tony.containers.resources=/home/bin/hadoop/share/hadoop/yarn/submarine-all-0.3.0-SNAPSHOT-hadoop-3.2.jar \
> --conf tony.application.framework=pytorch
> {code}
>