On 28 November 2015 at 15:10, Francisco Jerez <curroje...@riseup.net> wrote: > Emil Velikov <emil.l.veli...@gmail.com> writes: > >> Hi Francisco, >> >> In all due respect, I am surprised with the volume of justification >> you put into some changes. >> > And apparently it wasn't enough? :P > >> On 28 November 2015 at 11:52, Francisco Jerez <curroje...@riseup.net> wrote: >>> Francisco Jerez <curroje...@riseup.net> writes: >>> >>>> Tom Stellard <thomas.stell...@amd.com> writes: >>>> >>>>> When probing for devices, clover will call pipe_loader_probe() twice. >>>>> The first time to retrieve the number of devices, and then second time >>>>> to retrieve the device structures. >>>>> >>>>> We currently assume that the return value of both calls will be the >>>>> same, but this will not be the case if a device happens to disappear >>>>> between the two calls. >>>>> >>>> >>>> Assuming that the two sequential calls will give the same result is bad, >>>> but the likelihood of a device disappearing between the two calls is so >>>> small it's unlikely to happen more than once in a million years. If >> Err... have to love this estimate :-P With some recent chances that >> probability increases ever so slightly, as we push the "can we find >> the module, does it have the entry point" from .create_screen and >> .configure further up to probe screen. >> > > I was referring to the probability of a device actually disappearing > From the system between the two calls. > >>>> this problem happens deterministically to you there's almost certainly a >>>> more serious bug in pipe-loader that will cause it to return an >>>> incorrect device count during the first call. >>>> >> Care to give an example. I'm not sure where/how this is possible. >> > The immediately following paragraph explained how and where it's > possible: pipe_loader_sw_probe() returns one regardless of whether the > software device is actually available. > That's why we have the "two stage" approach 1) Query the total count, allocate memory and 2) populate the memory with information.
The alternative is to build a separate pipe-loader for each target, which ... doesn't sound optimal. Neither is forcing every target to have be with the same set of drivers. >>>> Indeed, pipe_loader_sw_probe() seems to return 1 unconditionally if ndev >>>> is zero, so it will give a different device count depending on ndev if >>>> pipe_loader_sw_probe_null() happens to fail. Seems like a bug we don't >>>> want to simply paper over. >>>> >> Can you elaborate what exactly the problem here ? If we ask "how many >> we have" sure we can provide that [fixed] number. On the other hand, >> if we want more information about it (them), there is no guarantee >> that we'll succeed. > > The return value of pipe_loader_probe() is supposed to be the number of > devices actually available on the system, it doesn't depend on the devs > pointer or ndev parameter -- See the doxygen comment in pipe_loader.h > for the expected behaviour and pipe_loader_drm_probe() for a correct > implementation. > True, the documentation does say that. Yet the EGL Devices approach is what I perceive as better (clearer, more intuitive) approach - if your storage pointer is null return the total count. Otherwise upon information retrieval cap up-to the specified device count. Admittedly the EGL spec came long after the pipe-loader. >> >>>> Anyway I'm not saying that clover's assumption shouldn't be fixed too -- >>>> See comment below. >>>> >>>>> This patch removes this assumption and checks the return value of the >>>>> second pipe_loader_probe() call to ensure it does not try to initialize >>>>> devices that no longer exits. >>>>> >>>>> CC: <mesa-sta...@lists.freedesktop.org> >>>>> --- >>>>> src/gallium/state_trackers/clover/core/platform.cpp | 5 +++-- >>>>> 1 file changed, 3 insertions(+), 2 deletions(-) >>>>> >>>>> diff --git a/src/gallium/state_trackers/clover/core/platform.cpp >>>>> b/src/gallium/state_trackers/clover/core/platform.cpp >>>>> index 328b71c..689d692 100644 >>>>> --- a/src/gallium/state_trackers/clover/core/platform.cpp >>>>> +++ b/src/gallium/state_trackers/clover/core/platform.cpp >>>>> @@ -28,9 +28,10 @@ platform::platform() : adaptor_range(evals(), devs) { >>>>> int n = pipe_loader_probe(NULL, 0); >>>>> std::vector<pipe_loader_device *> ldevs(n); >>>>> >>>>> - pipe_loader_probe(&ldevs.front(), n); >>>>> + n = pipe_loader_probe(&ldevs.front(), n); >>>>> >>>>> - for (pipe_loader_device *ldev : ldevs) { >>>>> + for (int i = 0; i < n; ++i) { >>>>> + pipe_loader_device *ldev = ldevs[i]; >>>> >>>> There are many reasons to prefer range loops to induction variables >>>> (including iterators), and there's no need to change it here: Because >>>> ldevs is zero-initialized you can just wrap the push_back(create()) call >>>> around an 'if (ldev)' to make this a one-liner. >>>> >>> Let me be more explicit about what could go wrong with this change in >>> its current form: The return value of pipe_loader_probe() is the number >>> of devices available on the system, so basically the same race condition >>> is still possible if 'n' increases from the first call to the second, >>> and in that case you'll end up reading uninitialized memory past the end >>> of the array -- Worse than a null dereference. Range loops make this >>> kind of situation impossible. >>> >> If I understood you correctly, you're saying that on the second call >> can return count greater than the initial one. >> >> Errm... this looks like a bug in the pipe-loader. I don't think it's >> normal to claim that X devices are available only to return >> information for fewer devices. OK, calling it a bug might not be >> politically correct, although it is a misleading design decision, at >> the very least. >> > What I was saying is that if the rationale of this patch is correct and > clover may not make any assumptions about the number of available > devices remaining the same between calls to pipe_loader_probe(), this > patch doesn't fix the problem at all, and in fact introduces a worse > kind of bug (read from uninitialized memory). The alternative I > suggested would be simpler and more robust, so it's worth changing > regardless of whether or not it's actually likely that the device count > will increase or decrease between calls. > So the difference boils down to what is/should be the behaviour of pipe_loader_probe(). I take it that you're adamant that pipe-loader's approach is superior, or there is a sliver of hope that one can sway your view ? Thanks Emil _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev