Typo alert. "The ... file that the BOINC client writes to each slot" (second paragraph) is of course init_data.xml, and I suspect the sample file contents likewise. Calling it app_info.xml may set people's minds running down the wrong path.
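For anyone following along, here is a rough sketch in C of the fallback order Charlie describes below: prefer gpu_opencl_dev_index, fall back to gpu_device_num, and finally to the --device value from the command line. This is not the BOINC implementation. The struct gpu_init_fields and the helper names are made up for illustration, and a negative value stands in for "field not present in init_data.xml".

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the init_data.xml fields discussed in the thread.
 * A negative value means the client was too old to write that field. */
struct gpu_init_fields {
    int gpu_device_num;        /* per-vendor (CAL/CUDA) device number */
    int gpu_opencl_dev_index;  /* index among OpenCL-capable devices */
};

/* Return the non-negative number following "--device", or -1 if absent
 * or malformed. */
int device_from_argv(int argc, char **argv) {
    for (int i = 1; i + 1 < argc; i++) {
        if (strcmp(argv[i], "--device") == 0) {
            char *end = NULL;
            long n = strtol(argv[i + 1], &end, 10);
            if (end != argv[i + 1] && *end == '\0' && n >= 0) {
                return (int)n;
            }
            return -1;
        }
    }
    return -1;
}

/* Number to index the OpenCL device list with: newest field first,
 * then gpu_device_num, then the command line. */
int pick_opencl_index(const struct gpu_init_fields *f, int argc, char **argv) {
    if (f->gpu_opencl_dev_index >= 0) return f->gpu_opencl_dev_index;
    if (f->gpu_device_num >= 0) return f->gpu_device_num;
    return device_from_argv(argc, argv);
}

/* Number to show to users: the physical device number, never the
 * OpenCL-only index. */
int pick_display_number(const struct gpu_init_fields *f, int argc, char **argv) {
    if (f->gpu_device_num >= 0) return f->gpu_device_num;
    return device_from_argv(argc, argv);
}
```

With the values from the example in the thread (gpu_device_num 1, gpu_opencl_dev_index 0, "--device 1" on the command line), this sketch indexes the OpenCL list with 0 but reports device 1 to the user.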
>________________________________
> From: Charlie Fenton <[email protected]>
>To: Raistmer the Sorcerer <[email protected]>
>Cc: [email protected]
>Sent: Tuesday, 16 April 2013, 11:18
>Subject: Re: [boinc_dev] Accelerator type identification issue
>
>
>Hello Raistmer,
>
>I am sorry if I have not been clear enough in my responses. We have _already_
>done what you are requesting, but not the way you suggest. Doing it the way
>you suggest would be incompatible with existing applications, servers and
>older versions of BOINC and so would break older code. Everything we have
>done is fully backward compatible.
>
>Under recent versions of BOINC, we have added more values to the app_info.xml
>file that the BOINC client writes to each slot. Let us use as an example the
>case we have been discussing in the SETI forum, where the user has 2 ATI GPUs.
> The first GPU (ATI GPU 0) is capable of (and recognized by) CAL but not
>OpenCL. The second ATI GPU (GPU 1) is recognized by and capable of both CAL
>and OpenCL.
>
>Thus, my test version of BOINC _correctly_ reports that ATI GPU 1 is the only
>OpenCL capable ATI GPU:
>> CAL: ATI GPU 0: ATI Radeon HD 2600 (RV630) (CAL version 1.4.1734, 1024MB,
>> 992MB available, 348 GFLOPS peak)
>> CAL: ATI GPU 1: ATI Radeon HD 4600 series (R730) (CAL version 1.4.1734,
>> 1024MB, 992MB available, 960 GFLOPS peak)
>> OpenCL: AMD/ATI GPU 1: ATI Radeon HD 4600 series (R730) (driver version CAL
>> 1.4.1734, device version OpenCL 1.0 AMD-APP (937.2), 1024MB, 992MB
>> available, 960 GFLOPS peak)
>
>His app_info.xml file will now contain the following values specifying that
>the application should use the first ATI GPU, which has ATI gpu_device_num 1
>and which also has ATI gpu_opencl_dev_index of 0:
><gpu_type>ATI</gpu_type>
><gpu_device_num>1</gpu_device_num>
><gpu_opencl_dev_index>0</gpu_opencl_dev_index>
>
>Older versions of init_data.xml don't have gpu_opencl_dev_index field.
>Still older versions of init_data.xml don't have the gpu_device_num or the
>gpu_type field.
>
>So those versions of BOINC passed the gpu_device_num to the application in the
>command line. If the value of gpu_device_num was 1, they would pass
>"--device 1" in the command line.
>
>For backward compatibility, BOINC _still_ passes the gpu_device_num to the
>application in the command line; in our example case it _still_ passes
>"--device 1" in the command line. To do anything different would break
>compatibility with older applications!
>
>Around August 2011, we realized that passing the device number is not
>sufficient if a user had both ATI and NVIDIA GPUs on the same computer, so we
>created the API:
>  int boinc_get_opencl_ids(cl_device_id* device, cl_platform_id* platform);
>which gets the GPU vendor and device number from the init_data.xml file.
>
>In January 2012, we discovered that on Macs, Apple's OpenCL does not support
>some NVIDIA GPUs which CUDA does support, so we added the gpu_opencl_dev_index
>field. This allowed the boinc_get_opencl_ids() API to handle these correctly.
> OpenCL project applications did not need to worry about this, as long as they
>were linked with a current version of boinc_get_opencl_ids(). When used with
>older versions of BOINC which do not provide the gpu_opencl_dev_index field,
>boinc_get_opencl_ids() reverts to using only gpu_device_num to be as backward
>compatible as possible.
>
>But in December 2012, we realized that boinc_get_opencl_ids() was not
>compatible with very old clients which did not provide the gpu_device_num or
>the gpu_type field.
>So we deprecated the old boinc_get_opencl_ids() API and
>added a new version which takes 5 arguments:
>  int boinc_get_opencl_ids(int argc, char** argv, int type,
>                           cl_device_id* device, cl_platform_id* platform);
>
>Passing in the same argv and argc which were passed to the application allows
>this function to use the value of --device from the command line for
>compatibility with very old BOINC clients which did not have the
>gpu_device_num field in the init_data.xml file. This gives us even better
>backward compatibility than we had before. And allowing the project
>application to pass in the type (NVIDIA, ATI or Intel) allows it to work with
>older BOINC clients which did not have the gpu_type in the init_data.xml file.
>
>This newest boinc_get_opencl_ids() API has an added feature. If your OpenCL
>application can run on any vendor's GPU, then you can create a plan class
>telling BOINC that the vendor (gpu_type) does not matter. On any version of
>BOINC new enough to put the gpu_type in the init_data.xml file, that one
>application will run on whichever GPU is assigned by BOINC; you will no longer
>need separate copies of the same OpenCL application for each GPU vendor.
>
>I looked at the source code for your OpenCL and Brook anonymous platform SETI
>Astropulse applications. I see they do not examine the --device argument
>directly, but instead call the older version of boinc_get_opencl_ids() with 2
>arguments. I strongly recommend you update to the newer, 5 argument version
>to have backward compatibility with even older versions of the BOINC client.
>
>A release of the BOINC client in the near future will handle most situations
>where one or more ATI/AMD GPUs support CAL but not OpenCL. Both the
>_existing_ 2-argument and the _existing_ 5-argument versions of
>boinc_get_opencl_ids() will take advantage of the improved GPU detection logic
>in this new BOINC client.
>
>I have one more suggestion.
>In your OpenCL anonymous platform SETI Astropulse
>application, the application writes "BOINC assigns device %d" with the value
>of BOINCs_device, which is the value of the gpu_opencl_dev_index. This is
>confusing to users, who have seen the GPUs identified by their physical device
>number gpu_device_num in the Event Log. It would be better if the application
>would display the physical device number, and use the gpu_opencl_dev_index
>only internally. You can get the value of gpu_device_num either from the
>--device command-line argument, or from the gpu_device_num field of the
>init_data struct.
>
>Cheers,
>--Charlie
>
>On Apr 16, 2013, at 1:12 AM, Raistmer the Sorcerer wrote:
>> >The reason BOINC _must_ use the same index for the same physical GPU is to
>> >prevent assigning the same physical GPU to more than one task at a time.
>> >This is the number reported by --device, and is the same as the index of
>> >CAL or CUDA capable GPUs.
>>
>> BOINC - yes (inside scheduler), but should BOINC report that physical number
>> to scientific apps? No. It should not!
>> For what reason should --device N mean PHYSICAL DEVICE?
>> What I propose is to define --device N as follows: an index into the device
>> array received from the enumeration API that an app of the corresponding
>> type uses.
>> That is, if the app is a CAL app then --device N means an index into the
>> array of CAL devices.
>> If it's CUDA app then --device is index to array of OpenCL devices (of
>> corresponding type NV, ATI, or intel_gpu).
>> And so on.
>> Look!
>> Currently we can have NV GPU + ATI GPU in the same OS. So, 2 physical
>> devices.
>> But each of NV and ATI apps will receive --device 0! As it should be if
>> device is defined as I propose, not as just "physical device". There
>> are 2 physical devices of different types.
>> In the case of CAL and OpenCL there are also 2 different physical devices:
>> 2 devices of CAL type and 1 device of OpenCL type.
>> BOINC (and ONLY BOINC
>> CLIENT) should know that device X from CAL list is same as device Y from
>> OpenCL list.
>>
>>
>> As of BOINC version 7.0.12, we have added a second index, which is the index
>> of only OpenCL-capable GPUs. In the above example, this would have the value
>> 0 for the HD 4600, and this value provides the API-specific index Raistmer
>> requests.
>>
>> We have deprecated the use of --device and now require GPU applications to
>> instead call boinc_get_opencl_ids(int argc, char** argv, int type,
>> cl_device_id* device, cl_platform_id* platform). It also optionally
>> allows an application to offer a plan class allowing it to run on all OpenCL
>> capable GPUs, not just from one vendor.
>>
>> The reason for the change is that this newer API deals automatically with
>> the possible difference between the CAL or CUDA device index and the OpenCL
>> device index. As the comments in the source file explain:
>> // A few complicating factors:
>> // Windows & Linux have a separate OpenCL platform for each vendor
>> // (NVIDIA, AMD, Intel).
>> // Mac has only one platform (Apple) which reports GPUs from all vendors.
>> //
>> // In all systems, opencl_device_indexes start at 0 for each platform
>> // and device_nums start at 0 for each vendor.
>> //
>> // On Macs, OpenCL does not always recognize all GPU models detected by
>> // CUDA, so a device_num may not correspond to its opencl_device_index
>> // even if all GPUs are from NVIDIA.
>>
>> I will add to this that we have recently learned that AMD's OpenCL does not
>> always recognize all GPU models detected by CAL, so a device_num may not
>> correspond to its opencl_device_index even if all GPUs are from ATI/AMD.
>>
>> NOTE: The new boinc_get_opencl_ids() API is 100% backward compatible with
>> older versions of the BOINC client. From the source file's comments:
>>
>> // This version is compatible with older clients.
>> // Usage:
>> // Pass the argc and argv received from the BOINC client
>> // type: may be PROC_TYPE_NVIDIA_GPU, PROC_TYPE_AMD_GPU or
>> //   PROC_TYPE_INTEL_GPU
>> // (it may also be 0, but then it will fail on older clients.)
>> //
>> // The argc, argv and type arguments are ignored for 7.0.12 or later clients.
>> //
>> // returns
>> // - 0 if success
>> // - ERR_FOPEN if init_data.xml missing
>> // - ERR_XML_PARSE if can't parse init_data.xml
>> // - CL_INVALID_DEVICE_TYPE if unable to get gpu_type information
>> // - ERR_NOT_FOUND if unable to get opencl_device_index or gpu device_num
>> // - an OpenCL error number if OpenCL error
>>
>> Finally, we have added two new prototype plan classes: opencl_nvidia_101 and
>> opencl_ati_101 for app versions that run on NVIDIA or ATI GPUs using OpenCL
>> 1.1, using at most 256MB of GPU RAM. You can modify sched_customize.cpp to
>> change these parameters or add your own plan classes, such as for OpenCL 1.0
>> or 1.2. These plan classes are not backward compatible and require BOINC
>> 7.0.x.
>>
>> Information about all of the above can be found at
>> <http://boinc.berkeley.edu/trac/wiki/OpenclApps>.
>>
>> I hope this answers your questions.
>>
>> Cheers,
>> --Charlie
>>
>> On Apr 15, 2013, at 6:48 AM, Raistmer the Sorcerer wrote:
>> > Regarding deprecation of --device N option:
>> > can anyone provide description for what reason it was done?
>> >
>> > Each API contains own enumeration.
>> > Each enumeration (in particular device class) starts from zero (0).
>> > What prevents BOINC to report --device N to app correctly if BOINC knows
>> > for what accelerator class designed ?
>> > In view of recent CAL/OpenCL issue (or in view OSX CUDA OpenCL issue, no
>> > matter):
>> > --device N for CAL should be 1 and 0 (2 CAL enabled devices installed);
>> > --device N for OpenCL should be only 0 (1 OpenCL capable device
>> > installed).
>> > BOINC keeps track of which device is which physical device; the app
>> > just needs the device number in its own API enumeration scheme.
>> > For what reason (for example) should my OpenCL app know that there is
>> > another, non-OpenCL device in the system? It should not. Hence, no "device
>> > 1", but "device 0". It doesn't care about keyboard or mouse, it should not
>> > care about CAL GPU too. It's BOINC's mission not to allocate the same
>> > physical device both as CAL and OpenCL at the same time.
>> > Currently the app receives an OpenCL context handler. Ok, no probs with
>> > that. But (!) ensure back compatibility! Such OpenCL context should contain
>> > the same device as the OpenCL enumeration API would provide if --device
>> > contains the offset in the device list. What particular issues do you see
>> > with this? By providing both --device _and_ OpenCL context (for what reason
>> > context - separate question but perhaps sometimes it's convenient indeed)
>> > you provide at least partial backward compatibility. If one can provide
>> > backward compatibility it should be done!
>> > All this (BOINC) is about using AVAILABLE user resources, already available
>> > ones. Not about requesting users to upgrade OS, buy new hardware and so
>> > on. Backward compatibility should be a keystone of the BOINC concept. And
>> > all these not really needed "deprecations" will play badly with the
>> > existing userbase.
>> >
>_______________________________________________
>boinc_dev mailing list
>[email protected]
>http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
>To unsubscribe, visit the above URL and
>(near bottom of page) enter your email address.
