I am trying to port some CUDA C code to ispc so that I can run an SPMD program on the CPU.
I have a few questions (answers to any or some of these will help a
lot):
1. Does ispc code need an error-checking mechanism like
CUDA_CHECK_ERRORS, which is useful for checking that memory was allocated
properly on the device/host?
2. I believe the __global__ qualifier tells the compiler to run the function
on the device, and __syncthreads() is for barrier synchronization among
instances. Both are probably not needed in ispc, so is there something I am
missing?
3. I think threadIdx in CUDA C is equivalent to programIndex in ispc, since
CUDA uses threads in blocks whereas ispc uses program instances in gangs.
However, I am having trouble porting this statement: const unsigned int id =
32 * blockIdx.x + threadIdx.x; My understanding is that one block
contains 32 threads and each id identifies a unique thread, but in ispc I would
use const unsigned int id = programIndex; Will this be correct?
4. Should I use the "uniform" keyword for declaring global variables, since
these will be shared across the gang? Also, the __shared__ keyword is used in
CUDA C to make variables reside in shared memory, easing communication
between threads in the same block. Is some similar mechanism possible in
ispc, or should I ignore it?
5. Atomic operations like atomicAdd in CUDA C are good for operating
safely on memory without being affected by other threads. The equivalents
in ispc are atomic_add_global and atomic_add_local. Which one should be
used when porting the code?
6. Regarding memory allocation: since there is no host/device split, I believe
there is no need for two variables like var1 and device_var1 for the same
value followed by a memory copy. Instead I am thinking of using a single
variable, and for porting statements like "cudaMalloc(&var, 4 * sizeof(float))"
I'll use "float* var = new float[4]" instead. Am I thinking in the right
direction?
7. Statements like cudaSetDevice(0) and cudaHostAllocPortable, used for
device selection and for allocating page-locked memory as portable, are mostly
useful when dealing with multiple GPU devices. Is any similar mechanism
required in ispc, or should these be ignored?
8. For porting the following code (which is probably used to launch
kernels using multidimensional grids of blocks and threads on the device in
CUDA) to ispc:
const dim3 threads(32, 1);
const dim3 grid(1, 1);
initKernel<<<grid, threads>>>(first_var);
updatKernel<<<grid, threads>>>(device_var1,device_var2);
I plan to use these lines directly:
initKernel(first_var); // here initKernel is just the name of the function in the ispc file
updatKernel(device_var1, device_var2);
Should I be using something else corresponding to the thread and block
information, or is this ispc function call fine?
9. Instead of std::fill_n (used for assigning values in standard C++),
which I believe won't be present in ispc (please correct me if I am
wrong), I am planning to use a for loop to fill the array's elements. And
instead of std::ofstream (used for writing data to files), which is used
here:
std::ofstream file1("file1.csv"); and then
file1 << i << "," << array_of_unsigned_ints[s] << std::endl; // s is the index
I am planning to use C-style file handling: FILE *file1 =
fopen("file1.csv", "w"); fputs(text, file1); fclose(file1); // where
text is defined as char text[] = strcat(strcat(strcat(i, ","),
array_of_unsigned_ints[s]), "\n"). Will this be fine, or can I do
something better? Also, since text is a character array, I believe I need to
know the maximum number of characters possible in an unsigned int, so what
is it?
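
To make item 9 concrete, here is a minimal sketch of the host-side C code I have in mind. It is an illustration, not tested against the real project. It uses fprintf to format the numbers directly, since strcat only accepts char* strings and cannot take integers. The helper names fill_array, write_csv, and max_unsigned_digits are my own placeholders, and the first CSV column is assumed to be the array index. If unsigned int is 32 bits, its largest value is 4294967295, which is 10 decimal digits (11 characters with the terminating '\0').

```c
#include <limits.h>
#include <stdio.h>

/* Hypothetical replacement for std::fill_n: a plain for loop. */
static void fill_array(unsigned int *values, unsigned int count, unsigned int value)
{
    for (unsigned int s = 0; s < count; ++s)
        values[s] = value;
}

/* Hypothetical helper: writes one "index,value" row per element.
 * Returns 0 on success and -1 if the file cannot be opened or written. */
static int write_csv(const char *path, const unsigned int *values, unsigned int count)
{
    FILE *file1 = fopen(path, "w");
    if (file1 == NULL)
        return -1;

    for (unsigned int s = 0; s < count; ++s) {
        /* fprintf formats the integers itself, so no intermediate
         * character buffer (and no strcat) is needed. */
        if (fprintf(file1, "%u,%u\n", s, values[s]) < 0) {
            fclose(file1);
            return -1;
        }
    }
    return fclose(file1) == 0 ? 0 : -1;
}

/* Number of decimal digits in the largest unsigned int on this platform.
 * snprintf returns the number of characters written, excluding '\0'. */
static int max_unsigned_digits(void)
{
    char text[32]; /* comfortably larger than any realistic unsigned int */
    return snprintf(text, sizeof text, "%u", UINT_MAX);
}
```

Calling fill_array(data, 4, 7u) followed by write_csv("file1.csv", data, 4) would produce four rows of the form "0,7", "1,7", and so on. If a text buffer is still wanted, sizing it from max_unsigned_digits() (plus room for the comma, the newline, and the terminator) avoids hard-coding the digit count.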