On Apr 16, 2013, at 3:26 AM, Richard Haselgrove wrote:
> Typo alert. "The ... file that the BOINC client writes to each slot" (second 
> paragraph) is of course init_data.xml, and I suspect the sample file contents 
> likewise. Calling it app_info.xml may set people's minds running down the 
> wrong path.

Right you are!  Thank you for your eagle eye.  It seems no matter how many 
times I proofread my writing, something always slips by.

Cheers,
--Charlie

> From: Charlie Fenton <[email protected]>
> To: Raistmer the Sorcerer <[email protected]> 
> Cc: [email protected] 
> Sent: Tuesday, 16 April 2013, 11:18
> Subject: Re: [boinc_dev] Accelerator type identification issue
> 
> Hello Raistmer,
> 
> I am sorry if I have not been clear enough in my responses.  We have 
> _already_ done what you are requesting, but not the way you suggest.  Doing 
> it the way you suggest would be incompatible with existing applications, 
> servers and older versions of BOINC and so would break older code.  
> Everything we have done is fully backward compatible.
> 
> Under recent versions of BOINC, we have added more values to the app_info.xml 
> file that the BOINC client writes to each slot.  Let us use as an example the 
> case we have been discussing in the SETI forum, where the user has 2 ATI 
> GPUs.  The first GPU (ATI GPU 0) is capable of (and recognized by) CAL but 
> not OpenCL.  The second ATI GPU (GPU 1) is recognized by and capable of both 
> CAL and OpenCL.
> 
> Thus, my test version of BOINC _correctly_ reports that ATI GPU 1 is the only 
> OpenCL capable ATI GPU:
> > CAL: ATI GPU 0: ATI Radeon HD 2600 (RV630) (CAL version 1.4.1734, 1024MB, 
> > 992MB available, 348 GFLOPS peak)
> > CAL: ATI GPU 1: ATI Radeon HD 4600 series (R730) (CAL version 1.4.1734, 
> > 1024MB, 992MB available, 960 GFLOPS peak)
> > OpenCL: AMD/ATI GPU 1: ATI Radeon HD 4600 series (R730) (driver version CAL 
> > 1.4.1734, device version OpenCL 1.0 AMD-APP (937.2), 1024MB, 992MB 
> > available, 960 GFLOPS peak)
> 
> His app_info.xml file will now contain the following values specifying that 
> the application should use the first OpenCL-capable ATI GPU, which has ATI 
> gpu_device_num 1 and ATI gpu_opencl_dev_index 0:
> <gpu_type>ATI</gpu_type>
> <gpu_device_num>1</gpu_device_num>
> <gpu_opencl_dev_index>0</gpu_opencl_dev_index>
> 
> Older versions of init_data.xml don't have the gpu_opencl_dev_index field.  
> Still older versions of init_data.xml don't have the gpu_device_num or the 
> gpu_type field.
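
A minimal sketch of how an application could tell these three generations of init_data.xml apart, assuming the fields appear as simple <tag>value</tag> pairs like the ones quoted above. The helper read_int_tag is hypothetical and is not part of the BOINC API; it uses a plain substring search rather than a real XML parser.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical helper (not a BOINC function): returns the integer found
// inside <tag>...</tag>, or `missing` if the tag is absent or its content
// is not a plain integer. Substring search only, not a real XML parser.
int read_int_tag(const std::string& xml, const std::string& tag, int missing) {
    const std::string open = "<" + tag + ">";
    const std::string close = "</" + tag + ">";
    const std::size_t start = xml.find(open);
    if (start == std::string::npos) return missing;
    const std::size_t value = start + open.size();
    const std::size_t end = xml.find(close, value);
    if (end == std::string::npos) return missing;
    const std::string text = xml.substr(value, end - value);
    char* stop = nullptr;
    const long n = std::strtol(text.c_str(), &stop, 10);
    if (stop == text.c_str() || *stop != '\0') return missing;
    return static_cast<int>(n);
}
```

Calling it with a sentinel such as -1 for `missing` lets the caller distinguish "field not written by this client" from a real index of 0.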
> 
> So those versions of BOINC passed the gpu_device_num to the application in 
> the command line.  If the value of gpu_device_num was 1, they would pass 
> "--device 1" in the command line.
> 
> For backward compatibility, BOINC _still_ passes the gpu_device_num to the 
> application in the command line; in our example case it _still_ passes 
> "--device 1" in the command line.  To do anything different would break 
> compatibility with older applications!
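
For illustration, an application that wants the physical device number from the command line could scan its arguments roughly as follows. This is only a sketch: device_num_from_args is a hypothetical name, not a BOINC function, and it assumes the "--device N" form shown above.

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical helper (not part of the BOINC API): returns the number that
// follows "--device" on the command line, or -1 if the option is absent or
// its value is not a non-negative integer.
int device_num_from_args(int argc, char** argv) {
    for (int i = 1; i + 1 < argc; ++i) {
        if (std::strcmp(argv[i], "--device") == 0) {
            char* end = nullptr;
            const long n = std::strtol(argv[i + 1], &end, 10);
            if (end != argv[i + 1] && *end == '\0' && n >= 0) {
                return static_cast<int>(n);
            }
            return -1;  // "--device" present but value malformed
        }
    }
    return -1;  // no "--device" argument
}
```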
> 
> Around August 2011, we realized that passing the device number is not 
> sufficient if a user had both ATI and NVIDIA GPUs on the same computer, so we 
> created the API:
>   int boinc_get_opencl_ids(cl_device_id* device, cl_platform_id* platform);
> which gets the GPU vendor and device number from the init_data.xml file.
> 
> In January 2012, we discovered that on Macs, Apple's OpenCL does not support 
> some NVIDIA GPUs which CUDA does support, so we added the 
> gpu_opencl_dev_index field.  This allowed the boinc_get_opencl_ids() API to 
> handle these correctly.  OpenCL project applications did not need to worry 
> about this, as long as they were linked with a current version of 
> boinc_get_opencl_ids().  When used with older versions of BOINC which do not 
> provide the gpu_opencl_dev_index field, boinc_get_opencl_ids() reverts to 
> using only gpu_device_num to be as backward compatible as possible.
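
The fallback described above can be pictured with the following sketch. It is not the actual boinc_get_opencl_ids() source; GpuFields and opencl_index_to_use are hypothetical names, and a negative value is assumed to mean "the client did not write this field".

```cpp
// Hypothetical stand-in for two of the init_data.xml fields discussed above.
// A negative value means the client did not provide that field.
struct GpuFields {
    int gpu_device_num;
    int gpu_opencl_dev_index;
};

// Prefer the OpenCL-specific index; when an older client did not supply it,
// fall back to the per-vendor device number. Returns -1 if neither exists.
int opencl_index_to_use(const GpuFields& f) {
    if (f.gpu_opencl_dev_index >= 0) return f.gpu_opencl_dev_index;
    if (f.gpu_device_num >= 0) return f.gpu_device_num;
    return -1;
}
```

In the two-GPU example from this thread (gpu_device_num 1, gpu_opencl_dev_index 0) this yields 0; with an older client that wrote only gpu_device_num 1 it yields 1.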
> 
> But in December 2012, we realized that boinc_get_opencl_ids() was not 
> compatible with very old clients which did not provide the gpu_device_num or 
> the gpu_type field.  So we deprecated the old boinc_get_opencl_ids() API and 
> added a new version which takes 5 arguments:
>   int boinc_get_opencl_ids(int argc, char** argv, int type,
>                            cl_device_id* device, cl_platform_id* platform);
> 
> Passing in the same argv and argc which were passed to the application allows 
> this function to use the value of --device from the command line for 
> compatibility with very old BOINC clients which did not have the 
> gpu_device_num field in the init_data.xml file.  This gives us even better 
> backward compatibility than we had before.  And allowing the project 
> application to pass in the type (NVIDIA, ATI or Intel) allows it to work with 
> older BOINC clients which did not have the gpu_type in the init_data.xml file.
> 
> This newest boinc_get_opencl_ids() API has an added feature.  If your OpenCL 
> application can run on any vendor's GPU, then you can create a plan class 
> telling BOINC that the vendor (gpu_type) does not matter.  On any version of 
> BOINC new enough to put the gpu_type in the init_data.xml file, that one 
> application will run on whichever GPU is assigned by BOINC; you will no 
> longer need separate copies of the same OpenCL application for each GPU 
> vendor.
> 
> I looked at the source code for your OpenCL and Brook anonymous platform SETI 
> Astropulse applications.  I see they do not examine the --device argument 
> directly, but instead call the older version of boinc_get_opencl_ids() with 2 
> arguments.  I strongly recommend you update to the newer, 5 argument version 
> to have backward compatibility with even older versions of the BOINC client.
> 
> A release of the BOINC client in the near future will handle most situations 
> where one or more ATI/AMD GPUs support CAL but not OpenCL.  Both the 
> _existing_ 2-argument and the _existing_ 5-argument versions of 
> boinc_get_opencl_ids() will take advantage of the improved GPU detection 
> logic in this new BOINC client.
> 
> I have one more suggestion.  In your OpenCL anonymous platform SETI 
> Astropulse application, the application writes "BOINC assigns device %d" with 
> the value of BOINCs_device, which is the value of the gpu_opencl_dev_index.  
> This is confusing to users, who have seen the GPUs identified by their 
> physical device number gpu_device_num in the Event Log.  It would be better 
> if the application would display the physical device number, and use the 
> gpu_opencl_dev_index only internally.  You can get the value of 
> gpu_device_num either from the --device command-line argument, or from the 
> gpu_device_num field of the init_data struct.
> 
> Cheers,
> --Charlie
> 
> On Apr 16, 2013, at 1:12 AM, Raistmer the Sorcerer wrote:
> > >The reason BOINC _must_ use the same index for the same physical GPU is to 
> > >prevent assigning the same physical GPU to more than one task at a time. 
> > >This is the number reported by --device, and is the same as the index of 
> > >CAL or CUDA capable GPUs. 
> > 
> > BOINC - yes (inside scheduler), but should BOINC report that physical 
> > number to scientific apps? No. It should not!
> > For what reason should --device N mean PHYSICAL DEVICE ?
> > What I propose is to define --device N as follows: an index into the device 
> > array received from the enumeration API that an app of the corresponding 
> > type uses.
> > That is, if app is CAL app then --device N means index into array of CAL 
> > devices.
> > If it's CUDA app then --device is index to array of OpenCL devices (of 
> > corresponding type NV, ATi, or intel_gpu).
> > And so on.
> > Look!
> > Currently we can have NV GPU + ATI GPU in the same OS. So, 2 physical 
> > devices. 
> > But each of NV and ATi apps will receive --device 0 ! As it should be if 
> > device will be defined as I propose, not as just "physical device". There 
> > are 2 physical devices of different types.
> > In case of CAL and OpenCL there are also 2 different physical devices. 2 
> > devices of CAL type and 1 device of OpenCL type. BOINC (and ONLY BOINC 
> > CLIENT) should know that device X from CAL list is same as device Y from 
> > OpenCL list.
> > 
> > 
> > As of BOINC version 7.0.12, we have added a second index, which is the 
> > index of only openCL-capable GPUs. In the above example, this would have 
> > the value 0 for the HD 4600, and this value provides the API-specific index 
> > Raistmer requests.
> > 
> > We have deprecated the use of --device and now require GPU applications to 
> > instead call boinc_get_opencl_ids(int argc, char** argv, int type, 
> > cl_device_id* device, cl_platform_id* platform). This API also optionally 
> > allows an application to offer a plan class allowing it to run on all 
> > OpenCL capable GPUs, not just those from one vendor.
> > 
> > The reason for the change is that this newer API deals automatically with 
> > the possible difference between the CAL or CUDA device index and the OpenCL 
> > device index. As the comments in the source file explain:
> > // A few complicating factors:
> > // Windows & Linux have a separate OpenCL platform for each vendor
> > // (NVIDIA, AMD, Intel).
> > // Mac has only one platform (Apple) which reports GPUs from all vendors.
> > //
> > // In all systems, opencl_device_indexes start at 0 for each platform
> > // and device_nums start at 0 for each vendor.
> > //
> > // On Macs, OpenCL does not always recognize all GPU models detected by 
> > // CUDA, so a device_num may not correspond to its opencl_device_index 
> > // even if all GPUs are from NVIDIA.
> > 
> > I will add to this that we have recently learned that AMD's OpenCL does not 
> > always recognize all GPU models detected by CAL, so a device_num may not 
> > correspond to its opencl_device_index even if all GPUs are from ATI/AMD.
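
To make the mismatch concrete, here is a small sketch of how a per-vendor device_num could map to an OpenCL index. It assumes, purely for illustration, that OpenCL lists the capable devices in the same relative order as CAL or CUDA; the function name and the vector of flags are hypothetical and are not how the BOINC client stores this information.

```cpp
#include <vector>

// opencl_capable[i] is true if the vendor's device_num i is also visible
// to OpenCL. Returns the index that device would have among the
// OpenCL-visible devices, or -1 if it is out of range or not visible.
// Assumption: OpenCL keeps the same relative order as the vendor API.
int opencl_index_for(const std::vector<bool>& opencl_capable, int device_num) {
    if (device_num < 0 ||
        device_num >= static_cast<int>(opencl_capable.size())) {
        return -1;
    }
    if (!opencl_capable[device_num]) return -1;
    int index = 0;
    for (int i = 0; i < device_num; ++i) {
        if (opencl_capable[i]) ++index;  // count capable devices before this one
    }
    return index;
}
```

With the HD 2600 (device 0, CAL only) and the HD 4600 (device 1, CAL and OpenCL), the flags are {false, true}, so device_num 1 maps to OpenCL index 0 and device_num 0 has no OpenCL index.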
> > 
> > NOTE: The new boinc_get_opencl_ids() API is 100% backward compatible with 
> > older versions of the BOINC client. From the source file's comments:
> > 
> > // This version is compatible with older clients.
> > // Usage:
> > // Pass the argc and argv received from the BOINC client
> > // type: may be PROC_TYPE_NVIDIA_GPU, PROC_TYPE_AMD_GPU or 
> > // PROC_TYPE_INTEL_GPU
> > // (it may also be 0, but then it will fail on older clients.)
> > //
> > // The argc, argv and type arguments are ignored for 7.0.12 or later 
> > // clients.
> > //
> > // returns
> > // - 0 if success
> > // - ERR_FOPEN if init_data.xml missing
> > // - ERR_XML_PARSE if can't parse init_data.xml
> > // - CL_INVALID_DEVICE_TYPE if unable to get gpu_type information
> > // - ERR_NOT_FOUND if unable to get opencl_device_index or gpu device_num
> > // - an OpenCL error number if OpenCL error
> > 
> > Finally, we have added two new prototype plan classes: opencl_nvidia_101 
> > and opencl_ati_101 for app versions that run on NVIDIA or ATI GPUs using 
> > OpenCL 1.1, using at most 256MB of GPU RAM. You can modify 
> > sched_customize.cpp to change these parameters or add your own plan 
> > classes, such as for OpenCL 1.0 or 1.2. These plan classes are not backward 
> > compatible and require BOINC 7.0.x.
> > 
> > Information about all of the above can be found at 
> > <http://boinc.berkeley.edu/trac/wiki/OpenclApps>.
> > 
> > I hope this answers your questions.
> > 
> > Cheers,
> > --Charlie
> > 
> > On Apr 15, 2013, at 6:48 AM, Raistmer the Sorcerer wrote:
> > > Regarding deprecation of --device N option:
> > > can anyone provide description for what reason it was done?
> > > 
> > > Each API contains own enumeration.
> > > Each enumeration (in particular device class) starts from zero (0).
> > > What prevents BOINC to report --device N to app correctly if BOINC knows 
> > > for what accelerator class designed ?
> > > In view of recent CAL/OpenCL issue (or in view OSX CUDA OpenCL issue, no 
> > > matter):
> > > --device N for CAL should be 1 and 0 (2 CAL enabled devices installed);
> > > --device N for OpenCL should be only 0 (1 OpenCL capable device 
> > > installed). BOINC keeps track what device is what physical device, app 
> > > just need device number in own API enumeration scheme.
> > > For what reason (for example) my OpenCL app should know that there are 
> > > another, non-OpenCL device in system ? It should not. Hence, no "device 
> > > 1", but "device 0". It doesn't care about keyboard or mouse, it should 
> > > not care about CAL GPU too. It's BOINC mission not to allocate same 
> > > physical device both as CAL and OpenCL in the same time.
> > > Currently app receives OpenCL context handler. Ok, no probs with that. But 
> > > (!) ensure back compatibility! such OpenCL context should contain same 
> > > device as OpenCL enumeration API would provide if --device contains 
> > > offset in device list. What particular issues do you see with this? But 
> > > providing both --device _and_ OpenCL context (for what reason context - 
> > > separate question but perhaps sometimes it's convenient indeed) you 
> > > provide at least partial backward compatibility. If one can provide 
> > > backward compatibility it should be done! 
> > > All this (BOINC) about using AVAILABLE user resources, already available 
> > > ones. Not about requesting users to upgrade OS, buy new hardware and so 
> > > on. Backward compatibility should be keystone of BOINC concept. And all 
> > > these not really needed "deprecations" will play badly with existing 
> > > userbase.
> > 
> 
> _______________________________________________
> boinc_dev mailing list
> [email protected]
> http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
> To unsubscribe, visit the above URL and
> (near bottom of page) enter your email address.
> 
> 
