[Bug 1009312] Re: 10de:0426 GPU loads unreliably, possible kernel timeout

2014-09-05 Thread Kyle Auble
Just wanted to add here that I think I've found an even simpler
workaround. It looks like passing pci=bios as a kernel parameter
consistently allows the GPU to load, regardless of kernel version or
power source. I haven't tested it extensively, but so far it has worked
on every boot.
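
For anyone who wants to try the same workaround, here is a minimal
sketch of making the parameter persistent. It assumes a stock Ubuntu
GRUB 2 setup where kernel options live in GRUB_CMDLINE_LINUX_DEFAULT in
/etc/default/grub; the helper name add_kernel_param is made up for
illustration and is not part of any tool.

```shell
# add_kernel_param FILE PARAM  (hypothetical helper, not a standard tool)
# Appends PARAM to the GRUB_CMDLINE_LINUX_DEFAULT line in FILE unless it
# is already there. Uses GNU sed's -i option, as shipped with Ubuntu.
add_kernel_param() {
    file=$1
    param=$2
    # Nothing to do if the parameter is already on the line.
    if grep -Eq "^GRUB_CMDLINE_LINUX_DEFAULT=\"([^\"]* )?$param( [^\"]*)?\"" "$file"; then
        return 0
    fi
    # Keep the existing options and add the new one at the end.
    sed -i "s/^GRUB_CMDLINE_LINUX_DEFAULT=\"\(.*\)\"/GRUB_CMDLINE_LINUX_DEFAULT=\"\1 $param\"/" "$file"
}

# Assumed usage, from a root shell (for example after `sudo -s`):
#   cp /etc/default/grub /etc/default/grub.bak
#   add_kernel_param /etc/default/grub pci=bios
#   update-grub
# After a reboot, `cat /proc/cmdline` should list pci=bios.
```

To test the parameter once without changing any files, it can also be
typed onto the kernel line from the GRUB boot menu editor.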

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1009312

Title:
  10de:0426 GPU loads unreliably, possible kernel timeout

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1009312/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Bug 1009312] Re: 10de:0426 GPU loads unreliably, possible kernel timeout

2014-07-04 Thread Kyle Auble
It's been a while, but I've found the time to dig much deeper into this
and familiarize myself somewhat with the kernel code. Actually, I feel
comfortable with the idea of directly contacting the appropriate mailing
list now, so this is more to keep the record up-to-date than a request
for more triage.

Anyway, after walking through the kernel code, I realized that the
first sign of the bug (the 30ms gap) was occurring somewhere within the
function pci_scan_child_bus (in drivers/pci/probe.c), between when it
invokes the function pci_scan_slot (also in drivers/pci/probe.c) and the
function pcibios_fixup_bus (in my case, under arch/x86/pci/common.c).

From there, I began adding dev_info statements around the function calls
executed in between, then checked which two consecutive messages the gap
fell between to narrow down the problem further. After a few rounds of
this, I found the delay consistently appearing within the function
pcie_aspm_configure_common_clock (in drivers/pci/pcie/aspm.c). After a
little research into what the PCIe common clock is, this actually
explains several aspects of the bug. Booting the computer from battery
power would influence the power state of the device, which is what ASPM
is all about. And it turns out the discrepancy of 24ms between a good
boot and a bad boot is precisely the length of time the PCIe standard
defines as a timeout for link training.
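
Spotting the gap between two messages can be automated. The following
is a rough sketch, assuming each log line starts with the usual dmesg
timestamp in seconds, such as "[    1.032000]"; the function name
find_gaps is made up for illustration.

```shell
# find_gaps THRESHOLD_MS  (hypothetical helper)
# Reads dmesg-style lines on stdin and prints every line that follows
# the previous timestamped line by at least THRESHOLD_MS milliseconds.
find_gaps() {
    awk -v limit_ms="$1" '
        # Only consider lines that begin with a "[ seconds.fraction]" stamp.
        match($0, /^\[ *[0-9]+\.[0-9]+\]/) {
            # Strip the brackets; adding 0 converts the text to a number.
            stamp = substr($0, 2, RLENGTH - 2) + 0
            if (seen && (stamp - prev) * 1000 >= limit_ms)
                printf "%.1f ms gap before: %s\n", (stamp - prev) * 1000, $0
            prev = stamp
            seen = 1
        }
    '
}

# Assumed usage: list every pause of 20 ms or more in the current log.
#   dmesg | find_gaps 20
```

Running this on logs from a good boot and a bad boot and comparing the
reported gaps around the PCI scan should show the same kind of
discrepancy described above.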

Unfortunately, I don't know how, or even if, the two commits I found
earlier directly tie into this. It seems there's a really weird race
condition or resource fight going on. I'm not sure how to fix the
problem cleanly either, because just adding the overhead of dev_info
statements to the function makes the bug go away (so I can technically
fix the bug, but that's a total hack). The one other little clue I found
was that the delay went away completely when I put dev_info statements
in every possible branch of the function's logic. When I only added
dev_info to the ifs corresponding to a problem, though, a slight delay
appeared (bumping the total time in the function to around 10ms), but
still not enough for link training to time out (so my GPU always
loaded).

I plan on mailing the list for the PCI subsystem of the kernel soon, but
I'm stumped about how exactly to proceed, so if you have any debugging
suggestions, I'd be happy to hear them. Thanks again.


[Bug 1009312] Re: 10de:0426 GPU loads unreliably, possible kernel timeout

2014-01-03 Thread Kyle Auble
Hmm... so I've just finished a first set of tests with reverting that commit,
and I definitely have results, though they aren't as cut-and-dried as I hoped.
When I reverted the commit, there were merge conflicts in:
drivers/acpi/scan.c
drivers/dma/acpi-dma.c

Since I really have no clue how these files work, I used git mergetool
to try simple ways of resolving the conflicts. I tried completely
reverting both files to the older version, and I tried leaving both as
they are at the tip of the master branch. In both of those cases, the
kernel failed to build, with make throwing an error when it reached the
relevant file, then stopping completely soon after with a [deb-pkg]
error. What's interesting is that when I kept drivers/acpi/scan.c in its
up-to-date form but entirely reverted drivers/dma/acpi-dma.c, the kernel
built successfully.

Unfortunately, when I tried testing it, that kernel build froze during
boot, and when I logged in with a stable kernel to check the dmesg logs,
the bug was still there. I would need to actually take the plunge and
spend a while learning how the code works before I could resolve the
conflict more precisely. However, I noticed the build process created a
debug package this time; is there some debug setting that I could enable
that would shed light on anything?


[Bug 1009312] Re: 10de:0426 GPU loads unreliably, possible kernel timeout

2014-01-03 Thread Christopher M. Penalver
** Changed in: linux (Ubuntu)
   Status: Incomplete => Triaged


[Bug 1009312] Re: 10de:0426 GPU loads unreliably, possible kernel timeout

2013-12-30 Thread Christopher M. Penalver
Kyle Auble, thank you for your commit bisection work. One thing that would be
helpful is to revert the noted commit in the latest mainline and see whether
the bug still occurs. Run the following in a terminal, then reboot and test
the new kernel:

git config --global user.email y...@example.com
git config --global user.name "Your Name"
cd $HOME
git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
cd linux
git revert ee8209fd026b074bb8eb75bece516a338a281b1b
git add .
git commit
cp /boot/config-`uname -r` .config
yes '' | make oldconfig
make clean
make -j `getconf _NPROCESSORS_ONLN` deb-pkg LOCALVERSION=-customrevert
cd ..
sudo dpkg -i *.deb
git fetch origin;git fetch origin master;git reset --hard FETCH_HEAD

** Changed in: linux (Ubuntu)
   Status: Triaged => Incomplete


[Bug 1009312] Re: 10de:0426 GPU loads unreliably, possible kernel timeout

2013-12-07 Thread Kyle Auble
So after another couple of months, I've managed to do more testing, and
I may have found something useful. First off, two fresh, stable versions
of the Ubuntu kernel have shown the bug: v3.2.0-56 (64-bit) and
v3.5.0-44 (32-bit). Also, the most recent package from the Ubuntu
mainline kernel PPA, v3.13.0-rc3 (built Dec. 6), showed the bug and
failed to boot. I confirmed the bug by looking at the old dmesg log
after rebooting into a working kernel.

On the positive side, after a little more free time and thinking about
the problem, I can give you a commit that may be canceling out the
effect from Xiao Guangrong's earlier patch. Instead of using git-bisect,
I narrowed down the problem to a small range from previous tests, then
manually checked the merges in the mainline kernel's history. After
tracing the regression to a simple merge, I rebased the short side
branch leading to it onto the preceding, bug-free commit. I don't know
if this method would give false results, but I figured that since the
merge itself involved no extra changes and the rebase didn't cause any
conflicts, it should be useful.
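
As a rough sketch of that rebase step (not the exact commands I ran):
the helper name linearize_merge and its arguments are made up for
illustration, and it assumes the merge has exactly two parents, with
the mainline as the first parent and the side branch as the second.

```shell
# linearize_merge MERGE_COMMIT NEW_BRANCH  (hypothetical helper)
# Replays the side branch that MERGE_COMMIT brought in on top of the
# mainline commit just before the merge, giving a linear history whose
# commits can be built and tested one at a time.
linearize_merge() {
    # Resolve both parents first, because checking out the new branch
    # moves HEAD and would change what a relative name refers to.
    base=$(git rev-parse "$1^1") || return 1   # mainline before the merge
    side=$(git rev-parse "$1^2") || return 1   # tip of the merged side branch
    git checkout -q -b "$2" "$side" &&
    git rebase -q "$base"
}

# Assumed usage inside a kernel checkout, with <merge> standing for the
# id of the suspect merge commit:
#   linearize_merge <merge> retest
#   git log --oneline <merge>^1..retest    # the commits to test in order
```

If the rebase stops with conflicts, the result is no longer a faithful
copy of the side branch, which is the case I was worried about above.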

The patch where the bug reappeared for me was:
ee8209fd026b074bb8eb75bece516a338a281b1b by Andy Shevchenko

Hope this helps some, and let me know if there's anything else I could
try.


[Bug 1009312] Re: 10de:0426 GPU loads unreliably, possible kernel timeout

2013-12-07 Thread Christopher M. Penalver
** Tags removed: kernel-request-3.11.0-7.14
** Tags added: bot-stop-nagging

** Changed in: linux (Ubuntu)
   Status: Confirmed => Triaged


[Bug 1009312] Re: 10de:0426 GPU loads unreliably, possible kernel timeout

2013-09-20 Thread Joseph Salisbury
Given the number of bugs that the Kernel Team receives during any
development cycle it is impossible for us to review them all. Therefore,
we occasionally resort to using automated bots to request further
testing. This is such a request.

We are approaching release and would like to confirm if this bug is
still present. Please test again with the latest development kernel and
indicate in the bug if this issue still exists or not.

You can update to the latest development kernel by simply running the
following commands in a terminal window:

sudo apt-get update
sudo apt-get dist-upgrade

If the bug still exists, change the bug status from Incomplete to
Confirmed. If the bug no longer exists, change the bug status from
Incomplete to Fix Released.

Thank you for your help, we really do appreciate it.


** Changed in: linux (Ubuntu)
   Status: Confirmed => Incomplete

** Tags added: kernel-request-3.11.0-7.14


[Bug 1009312] Re: 10de:0426 GPU loads unreliably, possible kernel timeout

2013-09-20 Thread Kyle Auble
I was a little confused about exactly which version of the kernel you
wanted to test, but it's a moot point because all of the ones I tried
still had the bug.

I'm still using Ubuntu 12.04 so sudo apt-get dist-upgrade just keeps me
on v3.5.0-40, which definitely has the bug. I also tested v3.11-rc1
(built 7/14) off of the Ubuntu Mainline PPA, v3.11-rc5 (it had the patch
by Rafael Wysocki I mentioned previously), and v3.12-rc1 (the latest
version). Every single one showed the bug, which I confirmed by checking
the dmesg logs after booting up with a stable kernel. Actually, none of
those three kernels even made it to the login screen.

It's a little unnerving that whatever changes fixed the bug around May
this year have been canceled out since then. Seeing the glass half-full
though, once I have some free time, I can do a standard bisection to see
where the fix was knocked out. That might give us a little more data to
work with.

** Changed in: linux (Ubuntu)
   Status: Incomplete => Confirmed


[Bug 1009312] Re: 10de:0426 GPU loads unreliably, possible kernel timeout

2013-09-19 Thread Christopher M. Penalver
** Summary changed:

- GPU loads unreliably, possible kernel timeout
+ 10de:0426 GPU loads unreliably, possible kernel timeout

** Changed in: linux (Ubuntu)
   Status: Incomplete => Confirmed
