[ https://issues.apache.org/jira/browse/SUBMARINE-308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
huiyangjian updated SUBMARINE-308:
----------------------------------
    Description:
1. Hope to support both stand-alone and distributed modes.
2. Hopefully, a PyTorch case will be available on the Submarine GitHub.
3. Referring to the PyTorch case in the documentation, the following script was run without success.
{code:java}
CLASSPATH=`hadoop classpath --glob`:/home/bin/hadoop/share/hadoop/yarn/submarine-all-0.3.0-SNAPSHOT-hadoop-3.2.jar \
 java org.apache.submarine.client.cli.Cli \
 job run --name ${APP_NAME} \
 --env DOCKER_JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/jre/ \
 --env DOCKER_HADOOP_HDFS_HOME=/app/hadoop-3.2.1 \
 --env HADOOP_HOME=/hadoop-3.2.1 \
 --env HADOOP_YARN_HOME=/hadoop-3.2.1 \
 --env HADOOP_COMMON_HOME=/hadoop-3.2.1 \
 --env HADOOP_HDFS_HOME=/hadoop-3.2.1 \
 --env HADOOP_CONF_DIR=/hadoop-3.2.1/etc/hadoop \
 --env PYTHONUNBUFFERED="0" \
 --env TZ="Asia/Shanghai" \
 --env YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS=${LOCAL_PATH}/:/home/test \
 --queue dev \
 --input_path hdfs://cluster/user/work/tensorflow/data/ \
 --docker_image jx-bd-hadoop13.zeus.lianjia.com:801/runonce/tf-1.13.1-pytorch-0.4-gpu-base:0.0.1 \
 --num_workers 1 \
 --worker_resources memory=16G,vcores=2,gpu=1 \
 --worker_launch_cmd "export CLASSPATH=\$(/app/hadoop-3.2.1/bin/hadoop classpath --glob) && cd /home/test/pth && python ../pth/train_pth.py" \
 --localization /home/local/test/cifar10_estimator:./submarine_algorithm --verbose \
 --conf tony.containers.resources=/home/bin/hadoop/share/hadoop/yarn/submarine-all-0.3.0-SNAPSHOT-hadoop-3.2.jar \
 --conf tony.application.framework=pytorch
{code}

  was:
1. Hope to support both stand-alone and distributed modes.
2. Hopefully, a PyTorch case will be available on the Submarine GitHub.
3. Referring to the PyTorch case in the documentation, the following script was run without success.
{code:java}
// code placeholder
CLASSPATH=`hadoop classpath --glob`:/home/bin/hadoop/share/hadoop/yarn/submarine-all-0.3.0-SNAPSHOT-hadoop-3.2.jar \
 java org.apache.submarine.client.cli.Cli \
 job run --name ${APP_NAME} \
 --env DOCKER_JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/jre/ \
 --env DOCKER_HADOOP_HDFS_HOME=/app/hadoop-3.2.1 \
 --env HADOOP_HOME=/hadoop-3.2.1 \
 --env HADOOP_YARN_HOME=/hadoop-3.2.1 \
 --env HADOOP_COMMON_HOME=/hadoop-3.2.1 \
 --env HADOOP_HDFS_HOME=/hadoop-3.2.1 \
 --env HADOOP_CONF_DIR=/hadoop-3.2.1/etc/hadoop \
 --env PYTHONUNBUFFERED="0" \
 --env TZ="Asia/Shanghai" \
 --env YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS=${LOCAL_PATH}/:/home/test \
 --queue dev \
 --input_path hdfs://cluster/user/work/tensorflow/data/ \
 --docker_image jx-bd-hadoop13.zeus.lianjia.com:801/runonce/tf-1.13.1-pytorch-0.4-gpu-base:0.0.1 \
 --num_workers 1 \
 --worker_resources memory=16G,vcores=2,gpu=1 \
 --worker_launch_cmd "export CLASSPATH=\$(/app/hadoop-3.2.1/bin/hadoop classpath --glob) && cd /home/test/pth && python ../pth/train_pth.py" \
 --localization /home/local/test/cifar10_estimator:./submarine_algorithm --verbose \
 --conf tony.containers.resources=/home/bin/hadoop/share/hadoop/yarn/submarine-all-0.3.0-SNAPSHOT-hadoop-3.2.jar \
 --conf tony.application.framework=pytorch
{code}


> support PyTorch with TonY runtime
> ---------------------------------
>
>                 Key: SUBMARINE-308
>                 URL: https://issues.apache.org/jira/browse/SUBMARINE-308
>             Project: Apache Submarine
>          Issue Type: New Feature
>          Components: Doc, Submarine Server
>            Reporter: huiyangjian
>            Priority: Major
>             Fix For: 0.3.0
>
>
> 1. Hope to support both stand-alone and distributed modes.
> 2. Hopefully, a PyTorch case will be available on the Submarine GitHub.
> 3. Referring to the PyTorch case in the documentation, the following script was run without success.
> {code:java}
> CLASSPATH=`hadoop classpath --glob`:/home/bin/hadoop/share/hadoop/yarn/submarine-all-0.3.0-SNAPSHOT-hadoop-3.2.jar \
>  java org.apache.submarine.client.cli.Cli \
>  job run --name ${APP_NAME} \
>  --env DOCKER_JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/jre/ \
>  --env DOCKER_HADOOP_HDFS_HOME=/app/hadoop-3.2.1 \
>  --env HADOOP_HOME=/hadoop-3.2.1 \
>  --env HADOOP_YARN_HOME=/hadoop-3.2.1 \
>  --env HADOOP_COMMON_HOME=/hadoop-3.2.1 \
>  --env HADOOP_HDFS_HOME=/hadoop-3.2.1 \
>  --env HADOOP_CONF_DIR=/hadoop-3.2.1/etc/hadoop \
>  --env PYTHONUNBUFFERED="0" \
>  --env TZ="Asia/Shanghai" \
>  --env YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS=${LOCAL_PATH}/:/home/test \
>  --queue dev \
>  --input_path hdfs://cluster/user/work/tensorflow/data/ \
>  --docker_image jx-bd-hadoop13.zeus.lianjia.com:801/runonce/tf-1.13.1-pytorch-0.4-gpu-base:0.0.1 \
>  --num_workers 1 \
>  --worker_resources memory=16G,vcores=2,gpu=1 \
>  --worker_launch_cmd "export CLASSPATH=\$(/app/hadoop-3.2.1/bin/hadoop classpath --glob) && cd /home/test/pth && python ../pth/train_pth.py" \
>  --localization /home/local/test/cifar10_estimator:./submarine_algorithm --verbose \
>  --conf tony.containers.resources=/home/bin/hadoop/share/hadoop/yarn/submarine-all-0.3.0-SNAPSHOT-hadoop-3.2.jar \
>  --conf tony.application.framework=pytorch
> {code}
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@submarine.apache.org
For additional commands, e-mail: dev-h...@submarine.apache.org