Update... So the kernel oops is happening because the driver is trying to tear down state for a V4L2 radio device - except there was no radio device configured so the tear-down ended up dereferencing through a null pointer. Boom.
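
For anyone following along, here is a minimal sketch of that failure mode and of the kind of guard that avoids it. This is not the actual pvrusb2 code: every name below (sketch_dev, sketch_v4l2, and the functions) is made up for illustration, and it is plain userspace C rather than kernel code. The only assumption carried over from the description above is that the radio pointer stays NULL when the hardware has no radio.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are NOT the real pvrusb2 structures. */
struct sketch_dev {
    int minor;       /* pretend device-node bookkeeping */
    int registered;  /* nonzero while the node is live */
};

struct sketch_v4l2 {
    struct sketch_dev *dev_video;  /* always set up */
    struct sketch_dev *dev_radio;  /* stays NULL when there is no radio */
};

/* Allocate and "register" one device node (hypothetical helper). */
static struct sketch_dev *sketch_dev_create(int minor)
{
    struct sketch_dev *dev = malloc(sizeof(*dev));
    if (!dev)
        return NULL;
    dev->minor = minor;
    dev->registered = 1;
    return dev;
}

/*
 * Tear down one device node. Returns 1 if something was torn down,
 * 0 if there was nothing to do. Without the NULL check, the write to
 * dev->registered below is a store through a NULL base pointer.
 */
static int sketch_dev_destroy(struct sketch_dev *dev)
{
    if (!dev)
        return 0;  /* no such device was ever set up: leave it alone */
    dev->registered = 0;
    free(dev);
    return 1;
}

/* Tear down everything; returns the number of nodes actually torn down. */
static int sketch_v4l2_destroy(struct sketch_v4l2 *vp)
{
    int count = 0;
    count += sketch_dev_destroy(vp->dev_video);
    vp->dev_video = NULL;
    count += sketch_dev_destroy(vp->dev_radio);
    vp->dev_radio = NULL;
    return count;
}

/*
 * Accessing 'registered' through a NULL base touches the address
 * offsetof(struct sketch_dev, registered): a small number, well inside
 * the first page, which is why such an access traps instead of
 * silently hitting valid memory.
 */
static size_t sketch_registered_offset(void)
{
    return offsetof(struct sketch_dev, registered);
}
```

With only dev_video populated, sketch_v4l2_destroy() tears down one node and skips the missing radio instead of faulting on it.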

I backtracked through the code to figure out "why now", and I could not find a reason. From what I can tell this bug has likely been there for about 11 years. The code which bypasses setup of the radio device takes that path if there's no radio support configured for the hardware - which is sadly the case for the HVR-1950 - and git blame shows that area of code was last modified in 2008. (That makes sense because that's about when the HVR-1950 was added.) Best I can figure is that some other happenstance had to have prevented the kernel from blowing up on this pointer. FWIW, it's actually trying to dereference an offset from null, but the offset is still small enough that the resulting address falls within the first virtual page and thus should be detected.

Anyway, I made a change to the two places in the code where this matters - basically, don't touch the radio data structure if it isn't there - and now the kernel oops is gone.

This also explains why I could not reproduce the problem before: the different device I was trying has a working radio in it that can be operated by the pvrusb2 driver, so this condition did not arise.

There's still other strangeness to figure out, namely the sysfs teardown problem and implementing *something* to keep a userspace I2C client from jamming up the pvrusb2 driver. But this is progress.

Obviously I will get this pushed. I can send you a source patch if you'd like to try rebuilding the module on your end. Since we're not running identical kernels I can't just send you the binary.

  -Mike

On Sun, 13 Oct 2019, Diego Rivera wrote:

> Mike,
>
> As a developer myself, I can fully understand the importance of this
> discovery!! I have no doubt that the stack trace differences you're
> observing are due to offset shifts from the added debug instructions
> (they have to be stored somewhere, after all). This is encouraging
> news!! Thanks for not giving up!
>
> As always: let me know if there's any way I can help the process!
>
> Cheers!
>
> On Sun, 2019-10-13 at 18:15 -0500, Mike Isely wrote:
> > Diego:
> >
> > I was *finally* able to reproduce the precise kernel oops you got.
> > I had to load the exact same Ubuntu kernel you are using and the
> > test had to run specifically against an HVR-1950. The older
> > (simpler) device I had been trying won't fail. But with that said,
> > I got your exact call trace.
> >
> > Now that I see the signature, I immediately tested again using a
> > 5.2.13 kernel.org vanilla kernel that is larded full of printk()
> > statements in the driver, again on an HVR-1950. And it blew chunks
> > again. The signature wasn't precisely the same (stack trace is
> > slightly different) but it's close enough that I believe it's the
> > same root cause.
> >
> > Now the real digging starts.
> >
> > Note: This is ignoring the sysfs tear-down collision I had
> > mentioned earlier (which, interestingly didn't happen this time,
> > probably because this oops stopped the tear-down before it got
> > that far). This is also with the external userspace I2C access
> > disabled so I can keep that source of log noise out of the way,
> > for now. So there's really 3 issues here. Trying to focus on the
> > one that is burning you specifically.
> >
> > If it turns out that what I'm seeing in the 5.2.13 kernel is
> > actually different, well then that just means there are 4 problems
> > :-( But right now I'm betting it's the same so that's the avenue
> > I'm going to chase. If I run aground, then I'm going to backtrack
> > to that specific Ubuntu kernel and rebuild it with all my debug
> > code added and other config tweaks to help with chasing the
> > problem.
> >
> > -Mike
>

--

Mike Isely
isely @ isely (dot) net
PGP: 03 54 43 4D 75 E5 CC 92 71 16 01 E2 B5 F5 C1 E8
_______________________________________________
pvrusb2 mailing list
http://www.isely.net/cgi-bin/mailman/listinfo/pvrusb2
