Hello Raistmer,

I am sorry if I have not been clear enough in my responses.  We have _already_ 
done what you are requesting, but not the way you suggest.  Doing it the way 
you suggest would be incompatible with existing applications, servers and older 
versions of BOINC and so would break older code.  Everything we have done is 
fully backward compatible.

Under recent versions of BOINC, we have added more values to the init_data.xml 
file that the BOINC client writes to each slot.  Let us use as an example the 
case we have been discussing in the SETI forum, where the user has 2 ATI GPUs.  
The first GPU (ATI GPU 0) is capable of (and recognized by) CAL but not OpenCL. 
 The second ATI GPU (GPU 1) is recognized by and capable of both CAL and OpenCL.

Thus, my test version of BOINC _correctly_ reports that ATI GPU 1 is the only 
OpenCL capable ATI GPU:
> CAL: ATI GPU 0: ATI Radeon HD 2600 (RV630) (CAL version 1.4.1734, 1024MB, 
> 992MB available, 348 GFLOPS peak)
> CAL: ATI GPU 1: ATI Radeon HD 4600 series (R730) (CAL version 1.4.1734, 
> 1024MB, 992MB available, 960 GFLOPS peak)
> OpenCL: AMD/ATI GPU 1: ATI Radeon HD 4600 series (R730) (driver version CAL 
> 1.4.1734, device version OpenCL 1.0 AMD-APP (937.2), 1024MB, 992MB available, 
> 960 GFLOPS peak)

His init_data.xml file will now contain the following values specifying that the 
application should use the second ATI GPU, which has ATI gpu_device_num 1 and 
which also has ATI gpu_opencl_dev_index of 0:
<gpu_type>ATI</gpu_type>
<gpu_device_num>1</gpu_device_num>
<gpu_opencl_dev_index>0</gpu_opencl_dev_index>
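
To make the relationship between the two numbers concrete, here is a minimal 
sketch of how they come out for this example.  It is illustrative only and is 
not the client's actual detection code; the GpuEntry struct and the 
assign_indices() function are hypothetical names made up for this sketch.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record for one GPU of a single vendor (not a BOINC type).
struct GpuEntry {
    std::string name;
    bool opencl_capable;    // does the vendor's OpenCL runtime report this GPU?
    int device_num;         // index among all GPUs of this vendor
    int opencl_dev_index;   // index among OpenCL-capable GPUs only; -1 if none
};

// Number the GPUs of one vendor in detection order.
// device_num counts every GPU; opencl_dev_index counts only the
// OpenCL-capable ones, so the two can differ for the same physical GPU.
void assign_indices(std::vector<GpuEntry>& gpus) {
    int next_opencl = 0;
    for (std::size_t i = 0; i < gpus.size(); ++i) {
        gpus[i].device_num = static_cast<int>(i);
        gpus[i].opencl_dev_index = gpus[i].opencl_capable ? next_opencl++ : -1;
    }
}
```

With the HD 2600 (CAL only) listed first and the HD 4600 (CAL and OpenCL) 
second, the HD 4600 ends up with device_num 1 and opencl_dev_index 0, matching 
the values shown above.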

Older versions of init_data.xml don't have the gpu_opencl_dev_index field.  
Still older versions of init_data.xml don't have the gpu_device_num or the 
gpu_type field.

So those versions of BOINC passed the gpu_device_num to the application in the 
command line.  If the value of gpu_device_num was 1, they would pass "--device 
1" in the command line.

For backward compatibility, BOINC _still_ passes the gpu_device_num to the 
application in the command line; in our example case it _still_ passes 
"--device 1" in the command line.  To do anything different would break 
compatibility with older applications!
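
For an application that reads the command line itself, the lookup might be 
sketched as follows.  This is only an illustration under the assumption that 
the option appears exactly as "--device N"; parse_device_arg() is a 
hypothetical helper, not part of the BOINC API.

```cpp
#include <climits>
#include <cstdlib>
#include <cstring>

// Return the number following "--device" on the command line,
// or -1 if the option is absent or its value is not a valid
// non-negative integer.
int parse_device_arg(int argc, char** argv) {
    for (int i = 1; i + 1 < argc; ++i) {
        if (std::strcmp(argv[i], "--device") == 0) {
            char* end = nullptr;
            long n = std::strtol(argv[i + 1], &end, 10);
            bool parsed = (end != argv[i + 1]) && (*end == '\0');
            if (parsed && n >= 0 && n <= INT_MAX) {
                return static_cast<int>(n);
            }
            return -1;  // "--device" present but value unusable
        }
    }
    return -1;  // "--device" not given
}
```

In our example case this returns 1, the physical device number of the HD 4600.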

Around August 2011, we realized that passing the device number is not 
sufficient if a user has both ATI and NVIDIA GPUs on the same computer, so we 
created the API:
   int boinc_get_opencl_ids(cl_device_id* device, cl_platform_id* platform);
which gets the GPU vendor and device number from the init_data.xml file.

In January 2012, we discovered that on Macs, Apple's OpenCL does not support 
some NVIDIA GPUs which CUDA does support, so we added the gpu_opencl_dev_index 
field.  This allowed the boinc_get_opencl_ids() API to handle these correctly.  
OpenCL project applications did not need to worry about this, as long as they 
were linked with a current version of boinc_get_opencl_ids().  When used with 
older versions of BOINC which do not provide the gpu_opencl_dev_index field, 
boinc_get_opencl_ids() reverts to using only gpu_device_num to be as backward 
compatible as possible.

But in December 2012, we realized that boinc_get_opencl_ids() was not 
compatible with very old clients which did not provide the gpu_device_num or 
the gpu_type field.  So we deprecated the old boinc_get_opencl_ids() API and 
added a new version which takes 5 arguments:
   int boinc_get_opencl_ids(int argc, char** argv, int type,
                            cl_device_id* device, cl_platform_id* platform);

Passing in the same argv and argc which were passed to the application allows 
this function to use the value of --device from the command line for 
compatibility with very old BOINC clients which did not have the gpu_device_num 
field in the init_data.xml file.  This gives us even better backward 
compatibility than we had before.  And allowing the project application to pass 
in the type (NVIDIA, ATI or Intel) allows it to work with older BOINC clients 
which did not have the gpu_type in the init_data.xml file.
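
The order of preference described above can be sketched like this.  It is a 
simplified model of the fallback logic, not the real implementation: the 
Vendor enum, the GpuInfo and Selection structs and choose_device() are all 
hypothetical, and "absent" fields are modeled as VENDOR_NONE or -1.

```cpp
// Hypothetical vendor codes for this sketch (not BOINC's PROC_TYPE_* values).
enum Vendor { VENDOR_NONE = 0, VENDOR_NVIDIA, VENDOR_ATI, VENDOR_INTEL };

// Hypothetical stand-in for the GPU fields read from init_data.xml.
struct GpuInfo {
    Vendor gpu_type;           // VENDOR_NONE if the client did not write it
    int gpu_device_num;        // -1 if the client did not write it
    int gpu_opencl_dev_index;  // -1 if the client did not write it
};

struct Selection {
    bool ok;               // false if vendor or index could not be determined
    Vendor vendor;
    int index;
    bool index_is_opencl;  // true: OpenCL index; false: physical device number
};

// Prefer what the client wrote; fall back to what the application supplies.
Selection choose_device(const GpuInfo& info, int cmdline_device, Vendor app_type) {
    Selection s = {false, VENDOR_NONE, -1, false};

    // Vendor: init_data.xml first, then the type passed in by the application.
    s.vendor = (info.gpu_type != VENDOR_NONE) ? info.gpu_type : app_type;
    if (s.vendor == VENDOR_NONE) return s;

    // Index: OpenCL index, then gpu_device_num, then --device.
    if (info.gpu_opencl_dev_index >= 0) {
        s.index = info.gpu_opencl_dev_index;
        s.index_is_opencl = true;
    } else if (info.gpu_device_num >= 0) {
        s.index = info.gpu_device_num;
    } else if (cmdline_device >= 0) {
        s.index = cmdline_device;
    } else {
        return s;
    }
    s.ok = true;
    return s;
}
```

With a current client every field is present, so the application-supplied 
type and the --device value are never consulted; with a very old client only 
those two are available.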

This newest boinc_get_opencl_ids() API has an added feature.  If your OpenCL 
application can run on any vendor's GPU, then you can create a plan class 
telling BOINC that the vendor (gpu_type) does not matter.  On any version of 
BOINC new enough to put the gpu_type in the init_data.xml file, that one 
application will run on whichever GPU is assigned by BOINC; you will no longer 
need separate copies of the same OpenCL application for each GPU vendor.

I looked at the source code for your OpenCL and Brook anonymous platform SETI 
Astropulse applications.  I see they do not examine the --device argument 
directly, but instead call the older version of boinc_get_opencl_ids() with 2 
arguments.  I strongly recommend you update to the newer, 5 argument version to 
have backward compatibility with even older versions of the BOINC client.

A release of the BOINC client in the near future will handle most situations 
where one or more ATI/AMD GPUs support CAL but not OpenCL.  Both the _existing_ 
2-argument and the _existing_ 5-argument versions of boinc_get_opencl_ids() 
will take advantage of the improved GPU detection logic in this new BOINC 
client.

I have one more suggestion.  In your OpenCL anonymous platform SETI Astropulse 
application, the application writes "BOINC assigns device %d" with the value of 
BOINCs_device, which is the value of the gpu_opencl_dev_index.  This is 
confusing to users, who have seen the GPUs identified by their physical device 
number gpu_device_num in the Event Log.  It would be better if the application 
would display the physical device number, and use the gpu_opencl_dev_index only 
internally.  You can get the value of gpu_device_num either from the --device 
command-line argument, or from the gpu_device_num field of the init_data struct.
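
As a small illustration of that suggestion (device_log_line() is a 
hypothetical helper; the message text is the one your application already 
prints):

```cpp
#include <string>

// Build the user-visible message from the physical device number
// (gpu_device_num), so it matches what users see in the Event Log.
// The OpenCL index is deliberately not part of the message.
std::string device_log_line(int gpu_device_num) {
    return "BOINC assigns device " + std::to_string(gpu_device_num);
}
```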

Cheers,
--Charlie

On Apr 16, 2013, at 1:12 AM, Raistmer the Sorcerer wrote:
> >The reason BOINC _must_ use the same index for the same physical GPU is to 
> >prevent assigning the same physical GPU to more than one task at a time. 
> >This is the number reported by --device, and is the same as the index of CAL 
> >or CUDA capable GPUs. 
> 
> BOINC - yes (inside scheduler), but should BOINC report that physical number 
> to scientific apps? No. It should not!
> For what reason should --device N mean PHYSICAL DEVICE?
> What I propose is to define --device N as follows: an index into the device 
> array returned by the enumeration API that an app of the corresponding type 
> uses.
> That is, if the app is a CAL app, then --device N means an index into the 
> array of CAL devices.
> If it's an OpenCL app, then --device is an index into the array of OpenCL 
> devices (of the corresponding type: NV, ATI, or intel_gpu).
> And so on.
> Look!
> Currently we can have NV GPU + ATI GPU in the same OS. So, 2 physical 
> devices. 
> But each of the NV and ATI apps will receive --device 0! As it should be if 
> the device is defined as I propose, not as just "physical device". There are 
> 2 physical devices of different types.
> In the case of CAL and OpenCL there are also 2 different physical devices: 2 
> devices of CAL type and 1 device of OpenCL type. BOINC (and ONLY the BOINC 
> CLIENT) should know that device X from the CAL list is the same as device Y 
> from the OpenCL list.
> 
> 
> As of BOINC version 7.0.12, we have added a second index, which is the index 
> of only openCL-capable GPUs. In the above example, this would have the value 
> 0 for the HD 4600, and this value provides the API-specific index Raistmer 
> requests.
> 
> This is one of the reasons that we have deprecated the use of --device and 
> now require GPU applications to instead call boinc_get_opencl_ids(int argc, 
> char** argv, int type, cl_device_id* device, cl_platform_id* platform). It 
> also optionally allows an application to offer a plan class allowing it to 
> run on all OpenCL capable GPUs, not just from one vendor.
> 
> The reason for the change is that this newer API deals automatically with the 
> possible difference between the CAL or CUDA device index and the OpenCL 
> device index. As the comments in the source file explain:
> // A few complicating factors:
> // Windows & Linux have a separate OpenCL platform for each vendor
> // (NVIDIA, AMD, Intel).
> // Mac has only one platform (Apple) which reports GPUs from all vendors.
> //
> // In all systems, opencl_device_indexes start at 0 for each platform
> // and device_nums start at 0 for each vendor.
> //
> // On Macs, OpenCL does not always recognize all GPU models detected by 
> // CUDA, so a device_num may not correspond to its opencl_device_index 
> // even if all GPUs are from NVIDIA.
> 
> I will add to this that we have recently learned that AMD's OpenCL does not 
> always recognize all GPU models detected by CAL, so a device_num may not 
> correspond to its opencl_device_index even if all GPUs are from ATI/AMD.
> 
> NOTE: The new boinc_get_opencl_ids() API is 100% backward compatible with 
> older versions of the BOINC client. From the source file's comments:
> 
> // This version is compatible with older clients.
> // Usage:
> // Pass the argc and argv received from the BOINC client
> // type: may be PROC_TYPE_NVIDIA_GPU, PROC_TYPE_AMD_GPU or PROC_TYPE_INTEL_GPU
> // (it may also be 0, but then it will fail on older clients.)
> //
> // The argc, argv and type arguments are ignored for 7.0.12 or later clients.
> //
> // returns
> // - 0 if success
> // - ERR_FOPEN if init_data.xml missing
> // - ERR_XML_PARSE if can't parse init_data.xml
> // - CL_INVALID_DEVICE_TYPE if unable to get gpu_type information
> // - ERR_NOT_FOUND if unable to get opencl_device_index or gpu device_num
> // - an OpenCL error number if OpenCL error
> 
> Finally, we have added two new prototype plan classes: opencl_nvidia_101 and 
> opencl_ati_101 for app versions that run on NVIDIA or ATI GPUs using OpenCL 
> 1.1, using at most 256MB of GPU RAM. You can modify sched_customize.cpp to 
> change these parameters or add your own plan classes, such as for OpenCL 1.0 
> or 1.2. These plan classes are not backward compatible and require BOINC 
> 7.0.x.
> 
> Information about all of the above can be found at 
> <http://boinc.berkeley.edu/trac/wiki/OpenclApps>.
> 
> I hope this answers your questions.
> 
> Cheers,
> --Charlie
> 
> On Apr 15, 2013, at 6:48 AM, Raistmer the Sorcerer wrote:
> > Regarding deprecation of --device N option:
> > can anyone provide description for what reason it was done?
> > 
> > Each API contains own enumeration.
> > Each enumeration (in particular device class) starts from zero (0).
> > What prevents BOINC from reporting --device N to the app correctly if BOINC 
> > knows which accelerator class the app is designed for?
> > In view of recent CAL/OpenCL issue (or in view OSX CUDA OpenCL issue, no 
> > matter):
> > --device N for CAL should be 1 and 0 (2 CAL enabled devices installed);
> > --device N for OpenCL should be only 0 (1 OpenCL capable device installed). 
> > BOINC keeps track what device is what physical device, app just need device 
> > number in own API enumeration scheme.
> > For what reason (for example) should my OpenCL app know that there is 
> > another, non-OpenCL device in the system? It should not. Hence, no 
> > "device 1", but "device 0". It doesn't care about the keyboard or mouse, 
> > and it should not care about a CAL GPU either. It's BOINC's mission not to 
> > allocate the same physical device as both CAL and OpenCL at the same time.
> > Currently the app receives an OpenCL context handle. Ok, no problem with 
> > that. But (!) ensure backward compatibility! Such an OpenCL context should 
> > contain the same device as the OpenCL enumeration API would provide if 
> > --device contains an offset into the device list. What particular issues do 
> > you see with this? By providing both --device _and_ the OpenCL context (for 
> > what reason a context is a separate question, but perhaps sometimes it's 
> > convenient indeed) you provide at least partial backward compatibility. If 
> > one can provide backward compatibility, it should be done! 
> > All this (BOINC) is about using AVAILABLE user resources, already available 
> > ones. Not about requesting users to upgrade the OS, buy new hardware and so 
> > on. Backward compatibility should be a keystone of the BOINC concept. And 
> > all these not really needed "deprecations" will play badly with the 
> > existing userbase.
> 
