On Apr 16, 2013, at 3:26 AM, Richard Haselgrove wrote:
> Typo alert. "The ... file that the BOINC client writes to each slot" (second
> paragraph) is of course init_data.xml, and I suspect the sample file contents
> likewise. Calling it app_info.xml may set people's minds running down the
> wrong path.

Right you are! Thank you for your eagle eye. It seems no matter how many times
I proofread my writing, something always slips by.

Cheers,
--Charlie

> From: Charlie Fenton <[email protected]>
> To: Raistmer the Sorcerer <[email protected]>
> Cc: [email protected]
> Sent: Tuesday, 16 April 2013, 11:18
> Subject: Re: [boinc_dev] Accelerator type identification issue
>
> Hello Raistmer,
>
> I am sorry if I have not been clear enough in my responses. We have
> _already_ done what you are requesting, but not the way you suggest. Doing
> it the way you suggest would be incompatible with existing applications,
> servers and older versions of BOINC and so would break older code.
> Everything we have done is fully backward compatible.
>
> Under recent versions of BOINC, we have added more values to the app_info.xml
> file that the BOINC client writes to each slot. Let us use as an example the
> case we have been discussing in the SETI forum, where the user has 2 ATI
> GPUs. The first GPU (ATI GPU 0) is capable of (and recognized by) CAL but
> not OpenCL. The second ATI GPU (GPU 1) is recognized by and capable of both
> CAL and OpenCL.
>
> Thus, my test version of BOINC _correctly_ reports that ATI GPU 1 is the only
> OpenCL capable ATI GPU:
> > CAL: ATI GPU 0: ATI Radeon HD 2600 (RV630) (CAL version 1.4.1734, 1024MB,
> > 992MB available, 348 GFLOPS peak)
> > CAL: ATI GPU 1: ATI Radeon HD 4600 series (R730) (CAL version 1.4.1734,
> > 1024MB, 992MB available, 960 GFLOPS peak)
> > OpenCL: AMD/ATI GPU 1: ATI Radeon HD 4600 series (R730) (driver version CAL
> > 1.4.1734, device version OpenCL 1.0 AMD-APP (937.2), 1024MB, 992MB
> > available, 960 GFLOPS peak)
>
> His app_info.xml file will now contain the following values specifying that
> the application should use the first ATI GPU, which has ATI gpu_device_num 1
> and which also has ATI gpu_opencl_dev_index of 0:
>     <gpu_type>ATI</gpu_type>
>     <gpu_device_num>1</gpu_device_num>
>     <gpu_opencl_dev_index>0</gpu_opencl_dev_index>
>
> Older versions of init_data.xml don't have the gpu_opencl_dev_index field.
> Still older versions of init_data.xml don't have the gpu_device_num or the
> gpu_type field.
>
> So those versions of BOINC passed the gpu_device_num to the application in
> the command line. If the value of gpu_device_num was 1, they would pass
> "--device 1" in the command line.
>
> For backward compatibility, BOINC _still_ passes the gpu_device_num to the
> application in the command line; in our example case it _still_ passes
> "--device 1" in the command line. To do anything different would break
> compatibility with older applications!
>
> Around August 2011, we realized that passing the device number is not
> sufficient if a user had both ATI and NVIDIA GPUs on the same computer, so we
> created the API:
>     int boinc_get_opencl_ids(cl_device_id* device, cl_platform_id* platform);
> which gets the GPU vendor and device number from the init_data.xml file.
>
> In January 2012, we discovered that on Macs, Apple's OpenCL does not support
> some NVIDIA GPUs which CUDA does support, so we added the
> gpu_opencl_dev_index field.
> This allowed the boinc_get_opencl_ids() API to
> handle these correctly. OpenCL project applications did not need to worry
> about this, as long as they were linked with a current version of
> boinc_get_opencl_ids(). When used with older versions of BOINC which do not
> provide the gpu_opencl_dev_index field, boinc_get_opencl_ids() reverts to
> using only gpu_device_num to be as backward compatible as possible.
>
> But in December 2012, we realized that boinc_get_opencl_ids() was not
> compatible with very old clients which did not provide the gpu_device_num or
> the gpu_type field. So we deprecated the old boinc_get_opencl_ids() API and
> added a new version which takes 5 arguments:
>     int boinc_get_opencl_ids(int argc, char** argv, int type,
>         cl_device_id* device, cl_platform_id* platform);
>
> Passing in the same argv and argc which were passed to the application allows
> this function to use the value of --device from the command line for
> compatibility with very old BOINC clients which did not have the
> gpu_device_num field in the init_data.xml file. This gives us even better
> backward compatibility than we had before. And allowing the project
> application to pass in the type (NVIDIA, ATI or Intel) allows it to work with
> older BOINC clients which did not have the gpu_type in the init_data.xml file.
>
> This newest boinc_get_opencl_ids() API has an added feature. If your OpenCL
> application can run on any vendor's GPU, then you can create a plan class
> telling BOINC that the vendor (gpu_type) does not matter. On any version of
> BOINC new enough to put the gpu_type in the init_data.xml file, that one
> application will run on whichever GPU is assigned by BOINC; you will no
> longer need separate copies of the same OpenCL application for each GPU
> vendor.
>
> I looked at the source code for your OpenCL and Brook anonymous platform SETI
> Astropulse applications.
> I see they do not examine the --device argument
> directly, but instead call the older version of boinc_get_opencl_ids() with 2
> arguments. I strongly recommend you update to the newer, 5-argument version
> to have backward compatibility with even older versions of the BOINC client.
>
> A release of the BOINC client in the near future will handle most situations
> where one or more ATI/AMD GPUs support CAL but not OpenCL. Both the
> _existing_ 2-argument and the _existing_ 5-argument versions of
> boinc_get_opencl_ids() will take advantage of the improved GPU detection
> logic in this new BOINC client.
>
> I have one more suggestion. In your OpenCL anonymous platform SETI
> Astropulse application, the application writes "BOINC assigns device %d" with
> the value of BOINCs_device, which is the value of the gpu_opencl_dev_index.
> This is confusing to users, who have seen the GPUs identified by their
> physical device number gpu_device_num in the Event Log. It would be better
> if the application would display the physical device number, and use the
> gpu_opencl_dev_index only internally. You can get the value of
> gpu_device_num either from the --device command-line argument, or from the
> gpu_device_num field of the init_data struct.
>
> Cheers,
> --Charlie
>
> On Apr 16, 2013, at 1:12 AM, Raistmer the Sorcerer wrote:
> > >The reason BOINC _must_ use the same index for the same physical GPU is to
> > >prevent assigning the same physical GPU to more than one task at a time.
> > >This is the number reported by --device, and is the same as the index of
> > >CAL or CUDA capable GPUs.
> >
> > BOINC - yes (inside scheduler), but should BOINC report that physical
> > number to scientific apps? No. It should not!
> > For what reason --device N should mean PHYSICAL DEVICE ?
> > What I propose is to set --device N meaning as next: index to device array,
> > received by that enumeration API that app of corresponding type uses.
> > That is, if app is CAL app then --device N means index into array of CAL
> > devices.
> > If it's CUDA app then --device is index to array of OpenCL devices (of
> > corresponding type NV, ATi, or intel_gpu).
> > And so on.
> > Look!
> > Currently we can have NV GPU + ATI GPU in the same OS. So, 2 physical
> > devices.
> > But each of NV and ATi apps will receive --device 0 ! As it should be if
> > device will be defined as I propose, not as just "physical device". There
> > are 2 physical devices of different types.
> > In case of CAL and OpenCL there are also 2 different physical devices. 2
> > devices of CAL type and 1 device of OpenCL type. BOINC (and ONLY BOINC
> > CLIENT) should know that device X from CAL list is same as device Y from
> > OpenCL list.
> >
> > As of BOINC version 7.0.12, we have added a second index, which is the
> > index of only OpenCL-capable GPUs. In the above example, this would have
> > the value 0 for the HD 4600, and this value provides the API-specific index
> > Raistmer requests.
> >
> > This is one of the reasons that we have deprecated the use of --device and
> > now require GPU applications to instead call boinc_get_opencl_ids(int argc,
> > char** argv, int type, cl_device_id* device, cl_platform_id* platform). It
> > also optionally allows an application to offer a plan class allowing it to
> > run on all OpenCL capable GPUs, not just from one vendor.
> >
> > The reason for the change is that this newer API deals automatically with
> > the possible difference between the CAL or CUDA device index and the OpenCL
> > device index. As the comments in the source file explain:
> > // A few complicating factors:
> > // Windows & Linux have a separate OpenCL platform for each vendor
> > // (NVIDIA, AMD, Intel).
> > // Mac has only one platform (Apple) which reports GPUs from all vendors.
> > //
> > // In all systems, opencl_device_indexes start at 0 for each platform
> > // and device_nums start at 0 for each vendor.
> > //
> > // On Macs, OpenCL does not always recognize all GPU models detected by
> > // CUDA, so a device_num may not correspond to its opencl_device_index
> > // even if all GPUs are from NVIDIA.
> >
> > I will add to this that we have recently learned that AMD's OpenCL does not
> > always recognize all GPU models detected by CAL, so a device_num may not
> > correspond to its opencl_device_index even if all GPUs are from ATI/AMD.
> >
> > NOTE: The new boinc_get_opencl_ids() API is 100% backward compatible with
> > older versions of the BOINC client. From the source file's comments:
> >
> > // This version is compatible with older clients.
> > // Usage:
> > // Pass the argc and argv received from the BOINC client
> > // type: may be PROC_TYPE_NVIDIA_GPU, PROC_TYPE_AMD_GPU or
> > // PROC_TYPE_INTEL_GPU
> > // (it may also be 0, but then it will fail on older clients.)
> > //
> > // The argc, argv and type arguments are ignored for 7.0.12 or later
> > // clients.
> > //
> > // returns
> > // - 0 if success
> > // - ERR_FOPEN if init_data.xml missing
> > // - ERR_XML_PARSE if can't parse init_data.xml
> > // - CL_INVALID_DEVICE_TYPE if unable to get gpu_type information
> > // - ERR_NOT_FOUND if unable to get opencl_device_index or gpu device_num
> > // - an OpenCL error number if OpenCL error
> >
> > Finally, we have added two new prototype plan classes: opencl_nvidia_101
> > and opencl_ati_101 for app versions that run on NVIDIA or ATI GPUs using
> > OpenCL 1.1, using at most 256MB of GPU RAM. You can modify
> > sched_customize.cpp to change these parameters or add your own plan
> > classes, such as for OpenCL 1.0 or 1.2. These plan classes are not backward
> > compatible and require BOINC 7.0.x.
> >
> > Information about all of the above can be found at
> > <http://boinc.berkeley.edu/trac/wiki/OpenclApps>.
> >
> > I hope this answers your questions.
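
For readers following the thread, the fallback order described in these
messages (gpu_opencl_dev_index first, then gpu_device_num, then the --device
value from the command line) can be sketched roughly as below. This is only an
illustration under stated assumptions, not BOINC's actual code: GpuFields,
device_from_argv and choose_opencl_index are hypothetical names, and -1 is an
assumed marker for a field that an older client did not write.

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for the GPU fields of init_data.xml.
// -1 means "field absent" (written by an older client). This is NOT the
// real BOINC init_data struct.
struct GpuFields {
    int gpu_device_num;        // per-vendor physical index (CAL/CUDA numbering)
    int gpu_opencl_dev_index;  // index among OpenCL-capable GPUs only
};

// Return the value following "--device" in argv, or -1 if it is missing.
int device_from_argv(int argc, char** argv) {
    for (int i = 1; i + 1 < argc; ++i) {
        if (std::strcmp(argv[i], "--device") == 0) {
            return std::atoi(argv[i + 1]);
        }
    }
    return -1;
}

// Pick the index to use when enumerating OpenCL devices:
//   1. gpu_opencl_dev_index, if the client provided it
//   2. gpu_device_num, for clients without the OpenCL index
//   3. "--device N" from the command line, for very old clients
int choose_opencl_index(const GpuFields& f, int argc, char** argv) {
    if (f.gpu_opencl_dev_index >= 0) return f.gpu_opencl_dev_index;
    if (f.gpu_device_num >= 0) return f.gpu_device_num;
    return device_from_argv(argc, argv);
}
```

In the two-GPU example from the thread (gpu_device_num 1, gpu_opencl_dev_index
0), this sketch would select index 0 for OpenCL enumeration, while an
application could still show device 1 to the user.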
> >
> > Cheers,
> > --Charlie
> >
> > On Apr 15, 2013, at 6:48 AM, Raistmer the Sorcerer wrote:
> > > Regarding deprecation of --device N option:
> > > can anyone provide description for what reason it was done?
> > >
> > > Each API contains own enumeration.
> > > Each enumeration (in particular device class) starts from zero (0).
> > > What prevents BOINC to report --device N to app correctly if BOINC knows
> > > for what accelerator class designed ?
> > > In view of recent CAL/OpenCL issue (or in view OSX CUDA OpenCL issue, no
> > > matter):
> > > --device N for CAL should be 1 and 0 (2 CAL enabled devices installed);
> > > --device N for OpenCL should be only 0 (1 OpenCL capable device
> > > installed). BOINC keeps track what device is what physical device, app
> > > just need device number in own API enumeration scheme.
> > > For what reason (for example) my OpenCL app should know that there are
> > > another, non-OpenCL device in system ? It should not. Hence, no "device
> > > 1", but "device 0". It doesn't care about keyboard or mouse, it should
> > > not care about CAL GPU too. It's BOINC mission not to allocate same
> > > physical device both as CAL and OpenCL in the same time.
> > > Currently app receives OpenCL context handler. Ok, no probs with that. But
> > > (!) ensure back compatibility! such OpenCL context should contain same
> > > device as OpenCL enumeration API would provide if --device contains
> > > offset in device list. What particular issues do you see with this? But
> > > providing both --device _and_ OpenCL context (for what reason context -
> > > separate question but perhaps sometimes it's convenient indeed) you
> > > provide at least partial backward compatibility. If one can provide
> > > backward compatibility it should be done!
> > > All this (BOINC) about using AVAILABLE user resources, already available
> > > ones. Not about requesting users to upgrade OS, buy new hardware and so
> > > on.
> > > Backward compatibility should be keystone of BOINC concept. And all
> > > these not really needed "deprecations" will play badly with existing
> > > userbase.
>
> _______________________________________________
> boinc_dev mailing list
> [email protected]
> http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
> To unsubscribe, visit the above URL and
> (near bottom of page) enter your email address.
