On Mon, Feb 14, 2011 at 6:03 PM, Peter Hutterer <peter.hutte...@who-t.net>wrote:
> On Fri, Feb 11, 2011 at 04:54:48PM -0800, Ping Cheng wrote:
> > Testing result shows that the driver may start processing data
> > before all the tools associated with the same logical port are
> > enabled or after one of the tools is removed during hotplugging
> > process. This causes unenabled tools to be accessed sometime,
> > which leads to a driver crash.
> >
> > Added a counter (wcmNumTools) to keep track of the enabled the tools
> > so we don't let the driver read the data if not all the tools are
> > enabled.
> >
> > The fact that all tools associated with the same logical port are
> > initialized first before they are enabled one by one is the foundation
> > that we could use a counter to cover both static defined (in xorg.conf)
> > and hotplugged/dynamic allocated devices with the code.
> >
> > Reviewed-by: Peter Hutterer <peter.hutte...@who-t.net>
> > Signed-off-by: Ping Cheng <pingli...@gmail.com>
> >
> > ---
> > Peter, I added your "if (!pInfo->dev || !pInfo->dev->enabled)" check as
> > a safeguard for the crash issue. Since it was intrduced to deal with
> > the same issue, I merged it in this patch. I also added your
> > Reviewed-by as you made privately yesterday. Let me know if you do not
> > agree.
>
> ok, I've just tested this patch and it seems to work. Why, I can't say
> though because when I dug down into it my head started dizzying.
> I can't give this my reviewed-by though (and I don't think I did so in
> private emails, but anyway). it doesn't crash the server, but that seems
> more by accident.
>
I must have misunderstood the comments.
When I shared the patch with you off line last week, you said if I move the
"if (common->fd_refs != common->wcmNumTools)" down above the last line of
case DEVICE_ON, I would be fine. I took that as you agreed with the patch. I
am sorry.
> > ---
> > src/wcmCommon.c | 5 ++---
> > src/xf86Wacom.c | 19 +++++++++++++++----
> > src/xf86WacomDefs.h | 5 +++++
> > 3 files changed, 22 insertions(+), 7 deletions(-)
> >
> > diff --git a/src/wcmCommon.c b/src/wcmCommon.c
> > index dcf792c..8c3afd8 100644
> > --- a/src/wcmCommon.c
> > +++ b/src/wcmCommon.c
> > @@ -1226,10 +1226,9 @@ static void commonDispatchDevice(WacomCommonPtr
> common, unsigned int channel,
> >
> > /* Tool on the tablet when driver starts. This sometime causes
> > * access errors to the device */
> > - if (!miPointerGetScreen(pDev->dev))
> > + if (!pDev || !pDev->dev || !pDev->dev->enabled)
> > {
> > - xf86Msg(X_ERROR, "wcmEvent: Wacom driver can not get
> Current Screen ID\n");
> > - xf86Msg(X_ERROR, "Please remove Wacom tool from the tablet
> and bring it back again.\n");
> > + xf86Msg(X_ERROR, "wcmEvent: tool not initialized yet.
> Skipping event. \n");
> > return;
> > }
> >
> > diff --git a/src/xf86Wacom.c b/src/xf86Wacom.c
> > index e8c4710..03eb8df 100644
> > --- a/src/xf86Wacom.c
> > +++ b/src/xf86Wacom.c
> > @@ -740,9 +740,17 @@ static int wcmReady(InputInfoPtr pInfo)
> >
> > static void wcmDevReadInput(InputInfoPtr pInfo)
> > {
> > + WacomDevicePtr priv = (WacomDevicePtr)pInfo->private;
> > + WacomCommonPtr common = priv->common;
> > int loop=0;
> > #define MAX_READ_LOOPS 10
> >
> > + /* don't read the data before all tools are enabled, or at
> > + * least one tool has been removed on the same logical port.
> > + */
> > + if (common->fd_refs != common->wcmNumTools)
> > + return;
> > +
> > /* move data until we exhaust the device */
> > for (loop=0; loop < MAX_READ_LOOPS; ++loop)
> > {
> > @@ -757,8 +765,6 @@ static void wcmDevReadInput(InputInfoPtr pInfo)
> > /* report how well we're doing */
> > if (loop > 0)
> > {
> > - WacomDevicePtr priv = (WacomDevicePtr)pInfo->private;
> > -
> > if (loop >= MAX_READ_LOOPS)
> > DBG(1, priv, "Can't keep up!!!\n");
> > else
> > @@ -880,9 +886,9 @@ static void wcmDevClose(InputInfoPtr pInfo)
> > static int wcmDevProc(DeviceIntPtr pWcm, int what)
> > {
> > InputInfoPtr pInfo = (InputInfoPtr)pWcm->public.devicePrivate;
> > -#ifdef DEBUG
> > WacomDevicePtr priv = (WacomDevicePtr)pInfo->private;
> > -#endif
> > + WacomCommonPtr common = priv->common;
> > +
> > Status rc = !Success;
> >
> > DBG(2, priv, "BEGIN dev=%p priv=%p "
> > @@ -900,9 +906,14 @@ static int wcmDevProc(DeviceIntPtr pWcm, int what)
> > case DEVICE_INIT:
> > if (!wcmDevInit(pWcm))
> > goto out;
> > +
> > + /* All tools on the same port are initialized one
> by one
> > + * here before any of them are enabled */
> > + common->wcmNumTools++;
>
> As I've said in other emails, this is not the case for the hotplugging case
> which is the most common case we face on modern desktop distributions.
>
Is there a specific way for me to tell if I am testing on a "modern desktop
distributions" or not, such as the version of X server, XInput, or
something? My testing was based on Fedora 14, up to date. It showed
"DEVICE_INIT for all devices, then DEVICE_ON for all devices." for
hot-plugging. Which distro should I use to test the second path?
Don't get me wrong, please. I am not question you. I need to understand it
so I can do it right.
> quote:
> "init in the config is quite different to init during hotplugging.
> the server will call, PreInit for all devices, DEVICE_INIT for all devices,
> then DEVICE_ON for all devices.
>
> For hotplugging, it's PreInit first device, PreInit + DEVICE_INIT +
> DEVICE_ON for all dependent devices in order, then DEVICE_INIT and
> DEVICE_ON
> for the first device."
>
> so with this patch, whenever DEVICE_ENABLE is called, you have
> common->wcmNumTools being equal to common->fd_refs. which means it'll go
> happily past the condition in wcmDevReadInput if a device sends events
> before all devices are enabled.
>
I need to know which system I can test with before I comment more. My
testing result doesn't tell me the above, somehow.
> the only reason why it doesn't crash (though the first hunk should avoid
> the
> crash anyway)
I do not know exactly why the patch works. As I've told you, I have been
working on this issue with a customer for sometime. We come to this point
based on intensive testing result, not theory. So, I can not say I am sure
it fixes the issue at its root cause.
The testing result showed the following:
without the "first hunk", the system crashes very often: once about a
hundred hot-plugging;
with the "first hunk", the system is still crashing. But a lot less: once
about 2k hot-plugging;
with the whole patch, the system has survived about 40k hot-plugging. Since
the customer feels that is good enough, they stopped the testing and called
it accepted.
> is because a memset in usbDispatchEvents resets for protocol
> level 5 and proximity resets pChannel->nSamples and any event is discarded
> because of the "drop the first 2 usb events" condition. so
> commonDispatchDevice is just never called for any event during this init
> period because nSamples is just reset after every event.
> At that point I stopped debugging because whatever the right behaviour is,
> I'm pretty sure this isn't it.
>
> > break;
> >
> > case DEVICE_ON:
> > + /* common->fd_refs is updated for each tool in
> wcmDevOpen */
>
> we need to avoid this sort of comment. It's confusing. there is no mention
> of fd_refs anywhere in wcmDevProc and it doesn't matter anyway - that's
> what
> wcmDevOpen() and wcmDevInit() are supposed to handle.
>
We can discuss how to make a comment more clear. I am willing to learn. For
this case, my intention was:
To point out that common->fd_refs is updated in wcmDevOpen so readers could
link common->fd_refs to common->wcmNumTools in the DEVICE_INIT above when
they see " if (common->fd_refs != common->wcmNumTools)".
If this is still unclear to you, I'll make more attempts until it makes
sense to you so you can reword the comments to make it clear.
> either way - the approach of counting the devices and only enabling when
> all
> are there is correct but this implementation isnt. I'll merge the first
> hunk
> for 0.10.11 because it at least stops the crashes, then we can figure out
> what to do after.
>
No problem. I just thought we could get a broader testing base if we could
have the patch in 0.10.11.
Ping
------------------------------------------------------------------------------
The ultimate all-in-one performance toolkit: Intel(R) Parallel Studio XE:
Pinpoint memory and threading errors before they happen.
Find and fix more than 250 security defects in the development cycle.
Locate bottlenecks in serial and parallel code that limit performance.
http://p.sf.net/sfu/intel-dev2devfeb
_______________________________________________
Linuxwacom-devel mailing list
Linuxwacom-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linuxwacom-devel