Hi Niketan,

Good idea, I think that would be the cleanest solution for now. Since JCuda
doesn't appear to be in a public Maven repo, it adds a layer of difficulty to
clean integration via Maven builds.

Deron
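[Editorial aside: since JCuda would be an optional external dependency (JCuda.jar on the classpath, the native libraries on LD_LIBRARY_PATH, as discussed in the quoted thread below), SystemML would need to detect at runtime whether it is actually present. A minimal, hypothetical sketch of such a probe in Java — the class AcceleratorCheck and method isJCudaAvailable are invented here for illustration; jcuda.Pointer is a real class shipped in JCuda.jar:]

    // Hypothetical sketch only: probe for the optional JCuda dependency
    // before enabling the GPU backend. Not actual SystemML code.
    public class AcceleratorCheck {
        public static boolean isJCudaAvailable() {
            try {
                // Succeeds only if the user put JCuda.jar on the classpath.
                Class.forName("jcuda.Pointer");
                return true;
            } catch (ClassNotFoundException e) {
                // JCuda.jar missing: fall back to plain CP/SPARK/MR instructions.
                return false;
            }
        }
    }

[The native JCuda .so/.dll would still need to be resolvable via LD_LIBRARY_PATH by the time the first GPU instruction runs, as in the spark-env.sh example quoted below.]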
On Wed, May 18, 2016 at 10:55 AM, Niketan Pansare <npan...@us.ibm.com> wrote:

> Hi Deron,
>
> Good points. I vote that we keep JCuda and other accelerators we add as an
> external dependency. This means the user will have to ensure JCuda.jar is
> in the class path and JCuda.DLL/JCuda.so is in the LD_LIBRARY_PATH.
>
> I don't think JCuda.jar is platform-specific.
>
> Thanks,
>
> Niketan Pansare
> IBM Almaden Research Center
> E-mail: npansar At us.ibm.com
> http://researcher.watson.ibm.com/researcher/view.php?person=us-npansar
>
> From: Deron Eriksson <deroneriks...@gmail.com>
> To: dev@systemml.incubator.apache.org
> Date: 05/18/2016 10:51 AM
> Subject: Re: Discussion on GPU backend
> ------------------------------
>
> Hi,
>
> I'm wondering what would be a good way to handle JCuda in terms of the
> build release packages. Currently we have 11 artifacts that we are
> building:
> systemml-0.10.0-incubating-SNAPSHOT-inmemory.jar
> systemml-0.10.0-incubating-SNAPSHOT-javadoc.jar
> systemml-0.10.0-incubating-SNAPSHOT-sources.jar
> systemml-0.10.0-incubating-SNAPSHOT-src.tar.gz
> systemml-0.10.0-incubating-SNAPSHOT-src.zip
> systemml-0.10.0-incubating-SNAPSHOT-standalone.jar
> systemml-0.10.0-incubating-SNAPSHOT-standalone.tar.gz
> systemml-0.10.0-incubating-SNAPSHOT-standalone.zip
> systemml-0.10.0-incubating-SNAPSHOT.jar
> systemml-0.10.0-incubating-SNAPSHOT.tar.gz
> systemml-0.10.0-incubating-SNAPSHOT.zip
>
> It looks like JCuda is platform-specific, so you typically need different
> jars/DLLs/.so files for each platform. If I'm understanding things
> correctly, if we generated Windows/Linux/LinuxPowerPC/MacOS-specific
> SystemML artifacts for JCuda, we'd potentially have an enormous number of
> artifacts.
>
> Is this something that could potentially be handled by specific profiles
> in the pom, so that a user could run something like "mvn clean package -P
> jcuda-windows" and be responsible for building the platform-specific
> SystemML jar for JCuda? Or is this something that could be handled
> differently, by putting the platform-specific JCuda jar on the classpath
> and any DLLs or other needed libraries on the path?
>
> Deron
>
> On Tue, May 17, 2016 at 10:50 PM, Niketan Pansare <npan...@us.ibm.com>
> wrote:
>
> > Hi Luciano,
> >
> > Like all our backends, there is no change in the programming model. The
> > user submits a DML script and specifies whether she wants to use an
> > accelerator. Assuming that we compile the JCuda jars into SystemML.jar,
> > the user can use the GPU backend with the following command:
> > spark-submit --master yarn-client ... -f MyAlgo.dml -accelerator -exec hybrid_spark
> >
> > The user also needs to set LD_LIBRARY_PATH so that it points to the
> > JCuda DLL or .so files. Please see
> > https://issues.apache.org/jira/browse/SPARK-1720 ... For example, the
> > user can add the following to spark-env.sh:
> > export LD_LIBRARY_PATH=<path to jcuda so>:$LD_LIBRARY_PATH
> >
> > The first version of the GPU backend will only accelerate CP. In this
> > case, we have four types of instructions:
> > 1. CP
> > 2. GPU (requires GPU on the driver)
> > 3. SPARK
> > 4. MR
> >
> > Note, the first version will require the CUDA/JCuda dependency to be
> > installed on the driver only.
> >
> > The next version will accelerate our distributed instructions as well.
> > In this case, we will have six types of instructions:
> > 1. CP
> > 2. GPU
> > 3. SPARK
> > 4. MR
> > 5. SPARK-GPU (requires GPU cluster)
> > 6. MR-GPU (requires GPU cluster)
> >
> > Thanks,
> >
> > Niketan Pansare
> > IBM Almaden Research Center
> > E-mail: npansar At us.ibm.com
> > http://researcher.watson.ibm.com/researcher/view.php?person=us-npansar
> >
> > From: Luciano Resende <luckbr1...@gmail.com>
> > To: dev@systemml.incubator.apache.org
> > Date: 05/17/2016 09:13 PM
> > Subject: Re: Discussion on GPU backend
> > ------------------------------
> >
> > Great to see detailed information on this topic Niketan, I guess I
> > missed it when you posted it initially.
> >
> > Could you elaborate a little more on what the programming model is when
> > the user wants to leverage GPU? Also, today I can submit a job to Spark
> > using --jars and it will handle copying the dependencies to the worker
> > nodes. If my application wants to leverage GPU, what extra dependencies
> > will be required on the worker nodes, and how are they going to be
> > installed/updated on the Spark cluster?
> >
> > On Tue, May 3, 2016 at 1:26 PM, Niketan Pansare <npan...@us.ibm.com>
> > wrote:
> >
> > > Hi all,
> > >
> > > I have updated the design document for our GPU backend in the JIRA
> > > https://issues.apache.org/jira/browse/SYSTEMML-445. The implementation
> > > details are based on the prototype I created, which is available in PR
> > > https://github.com/apache/incubator-systemml/pull/131. Once we are
> > > done with the discussion, I can clean up and separate out the GPU
> > > backend in a separate PR for easier review :)
> > >
> > > Here are the key design points:
> > > A GPU backend would implement two abstract classes:
> > > 1. GPUContext
> > > 2. GPUObject
> > >
> > > The GPUContext is responsible for GPU memory management and gets
> > > call-backs from SystemML's bufferpool on the following methods:
> > > 1. void acquireRead(MatrixObject mo)
> > > 2. void acquireModify(MatrixObject mo)
> > > 3. void release(MatrixObject mo, boolean isGPUCopyModified)
> > > 4. void exportData(MatrixObject mo)
> > > 5. void evict(MatrixObject mo)
> > >
> > > A GPUObject (like RDDObject and BroadcastObject) is stored in the
> > > CacheableData object. It contains the following methods that are
> > > called back from the corresponding GPUContext:
> > > 1. void allocateMemoryOnDevice()
> > > 2. void deallocateMemoryOnDevice()
> > > 3. long getSizeOnDevice()
> > > 4. void copyFromHostToDevice()
> > > 5. void copyFromDeviceToHost()
> > >
> > > In the initial implementation, we will add JCudaContext and
> > > JCudaPointer that will extend the above abstract classes respectively.
> > > The JCudaContext will be created by ExecutionContextFactory depending
> > > on the user-specified accelerator. Analogous to MR/SPARK/CP, we will
> > > add a new ExecType (GPU) and implement GPU instructions.
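[Editorial aside: for reference, the two abstract classes described immediately above, written out as a minimal Java sketch. The method signatures are copied verbatim from the lists in the design note; the package, modifiers, and everything else are assumptions, and the actual prototype is in PR 131 linked above. MatrixObject is SystemML's existing host-side matrix handle.]

    // Sketch only: signatures taken from the design note above; visibility,
    // extra members, and packaging are guesses.
    public abstract class GPUContext {
        // Call-backs invoked by SystemML's bufferpool.
        public abstract void acquireRead(MatrixObject mo);
        public abstract void acquireModify(MatrixObject mo);
        public abstract void release(MatrixObject mo, boolean isGPUCopyModified);
        public abstract void exportData(MatrixObject mo);
        public abstract void evict(MatrixObject mo);
    }

    // Stored in a CacheableData object (like RDDObject and BroadcastObject);
    // called back from the corresponding GPUContext.
    public abstract class GPUObject {
        public abstract void allocateMemoryOnDevice();
        public abstract void deallocateMemoryOnDevice();
        public abstract long getSizeOnDevice();
        public abstract void copyFromHostToDevice();
        public abstract void copyFromDeviceToHost();
    }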
> > >
> > > The above design is general enough so that other people can implement
> > > custom accelerators (for example, OpenCL), and it also follows the
> > > design principles of our CP bufferpool.
> > >
> > > Thanks,
> > >
> > > Niketan Pansare
> > > IBM Almaden Research Center
> > > E-mail: npansar At us.ibm.com
> > > http://researcher.watson.ibm.com/researcher/view.php?person=us-npansar
> >
> > --
> > Luciano Resende
> > http://twitter.com/lresende1975
> > http://lresende.blogspot.com/
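[Editorial aside: to illustrate the extensibility point at the end of Niketan's design note, a custom accelerator such as OpenCL would plug in by subclassing the same two abstractions. The OpenCLContext below is purely hypothetical (nothing like it exists in the codebase); the bodies only mark where backend-specific logic would live.]

    // Hypothetical OpenCL backend, mirroring what JCudaContext/JCudaPointer
    // do for CUDA. Method bodies are placeholders, not real OpenCL calls.
    public class OpenCLContext extends GPUContext {
        @Override public void acquireRead(MatrixObject mo)   { /* ensure device copy is current */ }
        @Override public void acquireModify(MatrixObject mo) { /* allocate device buffer for writing */ }
        @Override public void release(MatrixObject mo, boolean isGPUCopyModified) { /* unlock, mark dirty */ }
        @Override public void exportData(MatrixObject mo)    { /* copy device buffer back to host */ }
        @Override public void evict(MatrixObject mo)         { /* free device memory under pressure */ }
    }

[ExecutionContextFactory would then hand out this context when the user asks for that accelerator, just as the note describes for JCudaContext.]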