http://sbel.wisc.edu/Forum/index.php?topic=19.0

Topic: getting CUDA assembly code with nvcc
Dan Negrut
« on: September 17, 2008, 04:51:55 PM »


Last lecture I mentioned that you want as many of your variables as possible stored in registers and/or shared memory.
However, if you have too many variables that you hope will live in registers, some of them might end up as local variables, which are stored in global memory. This comes at a performance price: what you hoped to access with no overhead, and wanted to use intensively, ends up in global memory and is still used intensively. Hence the performance hit.

In order to figure out how many registers your code uses, as well as how much shared memory it uses, you need to look at the assembly code that the nvcc driver outputs when it is invoked on your CUDA source file.

In other words, let's say that your typical command line to compile the .cu file and turn it into a binary looks like this (very typical; you can get it from the build log in Developer Studio):
nvcc.exe -ccbin "C:\Program Files\Microsoft Visual Studio 8\VC\bin"  -c -D_CONSOLE -Xcompiler "/EHsc /W3 /nologo /Wp64 /O2 /Zi   /MT " -I"C:\CUDA\include" -I"C:\Progra~1\NVIDIA~1\NVIDIA~1\common\inc" -o Release\driver.obj driver.cu

This needs to be modified to:
nvcc.exe -ptx  -D_CONSOLE -Xcompiler "/EHsc /W3 /nologo /Wp64 /O2 /Zi  /MT " -I"C:\CUDA\include" -I"C:\Progra~1\NVIDIA~1\NVIDIA~1\common\inc" driver.cu

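The most important thing is that you tell the nvcc driver not to generate a binary (drop -c and the -o output file) but rather to generate assembly code (through the -ptx setting). Note that -ccbin by itself only tells nvcc where the host compiler lives; it is dropped here along with the rest of the fluff to get it to fly.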
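Once you have the PTX file, you still have to read through it. Here is a minimal sketch in Python of one way to skim it: it lists the .reg, .local and .shared declarations it finds. The function name summarize_ptx and the PTX fragment below are made up for illustration (the fragment is hand-written, not real nvcc output). Keep in mind that this only reports what is declared in the PTX text, so treat it as a rough indicator rather than the final resource usage of the compiled kernel.

```python
import re
from collections import Counter

# Hand-written PTX-like fragment for illustration only; NOT real nvcc output.
SAMPLE_PTX = """\
.entry myKernel
{
    .reg .u32 %r<12>;
    .reg .f32 %f<20>;
    .shared .align 4 .b8 tile[1024];
    .local .align 4 .b8 scratch[64];
    ld.local.f32 %f1, [scratch+0];
}
"""

# Matches declaration lines such as ".reg .u32 %r<12>;"
# or ".local .align 4 .b8 scratch[64];". Instructions like
# "ld.local.f32" do not start with a dot, so they are skipped.
DECL = re.compile(r"^\s*\.(reg|local|shared)\b(.*);")


def summarize_ptx(text):
    """Count .reg/.local/.shared declarations and collect the declared names."""
    counts = Counter()
    names = {"reg": [], "local": [], "shared": []}
    for line in text.splitlines():
        match = DECL.match(line)
        if not match:
            continue
        space, rest = match.group(1), match.group(2)
        tokens = rest.split()
        if not tokens:
            continue
        counts[space] += 1
        # The declared name is the last whitespace-separated token.
        names[space].append(tokens[-1])
    return counts, names


if __name__ == "__main__":
    counts, names = summarize_ptx(SAMPLE_PTX)
    for space in ("reg", "local", "shared"):
        print(space, counts[space], names[space])
```

To try it on your own code, replace SAMPLE_PTX with the contents of the file produced by the -ptx command above (I am assuming it is written next to driver.cu as driver.ptx). Any .local entries are the ones to worry about.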

Note that I also attached a snapshot of Developer Studio that shows the BuildLog.htm file (you get this when you build your solution in Dev Studio).

I hope this helps.
Dan

