Re: Re: Boot fails after stretch upgrade to linux 4.9.0

2017-08-13 Thread Francesco Montanari
As an alternative to disabling Nvidia Optimus in the BIOS, another
off-list exchange suggested disabling nouveau runtime power
management.


Create a file /etc/modprobe.d/nouveau.conf and add the following to it:

options nouveau runpm=0

Then reboot. Works for me (kernel: linux 4.9.0-3-amd64).
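
For reference, the steps above can be sketched as a small shell helper.
This is only a sketch: the function name write_nouveau_runpm_conf and
its optional directory argument are made up for illustration, and the
initramfs rebuild is an assumption that only matters if nouveau is
loaded from the initramfs.

```shell
#!/bin/sh
# Sketch: write the modprobe option that disables nouveau runtime PM.
# The function name and the optional directory argument are illustrative;
# the default target is the file named in the message above.
write_nouveau_runpm_conf() {
    conf_dir="${1:-/etc/modprobe.d}"
    printf '%s\n' 'options nouveau runpm=0' > "$conf_dir/nouveau.conf"
}

# Typical use, as root (nothing is run automatically here):
#   write_nouveau_runpm_conf
#   update-initramfs -u   # assumption: only needed if nouveau is in the initramfs
#   reboot
# After rebooting, the active value should be readable as root with:
#   cat /sys/module/nouveau/parameters/runpm
```

The same option can presumably be tried for a single boot, without
touching any file, by adding nouveau.runpm=0 to the kernel command line.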

Thanks,
Francesco



Re: Boot fails after stretch upgrade to linux 4.9.0

2017-02-13 Thread francesco . montanari

On 2017-02-04 19:26, Francesco Montanari wrote:

> To sum up:
> - Using linux 4.8.0, I notice no problems.
> - When I first booted linux 4.9.0, I got a TPM error message that I
>   never had before. Disabling the security chip from BIOS solved it.
> - The system does not boot, as it gets stuck after loading some
>   service (e.g., network manager).
> - It boots in recovery mode, but freezes when running the lspci
>   command. It also freezes if I run "systemctl reboot".
> - Logs show error messages related to acpi, drivers, a remount error
>   of the EXT4 partition (fsck did not find problems), and a segfault
>   related to gnome-session-f (https://is.gd/fMGaPr).



Comparing various logs of 4.8.0 vs 4.9.0 I didn't spot significant 
differences. As a test, I also tried to install the proprietary nvidia 
drivers and CPU microcode update packages, but things didn't change.


Thanks to an off-list exchange I managed to boot 4.9.0 in normal mode by
deactivating the discrete graphics card (Nvidia Optimus) in the BIOS. The
integrated card is good enough for me anyway.


Best,
Francesco



Re: Boot fails after stretch upgrade to linux 4.9.0

2017-02-04 Thread Francesco Montanari
Hi,

I may not have copied the correct error message shown when the system
freezes after running lspci in recovery mode (unless the message
changed). The corrected message is reported below.

To sum up:
- Using linux 4.8.0, I notice no problems.
- When I first booted linux 4.9.0, I got a TPM error message that I never
  had before. Disabling the security chip from BIOS solved it.
- The system does not boot, as it gets stuck after loading some
  service (e.g., network manager).
- It boots in recovery mode, but freezes when running the lspci
  command. It also freezes if I run "systemctl reboot".
- Logs show error messages related to acpi, drivers, a remount error
  of the EXT4 partition (fsck did not find problems), and a segfault
  related to gnome-session-f (https://is.gd/fMGaPr).

Do you have any hints for identifying and better understanding the
origin of the issue (please let me know if you need more info)?




Here is the message before the system gets stuck:
# lspci -nn | grep VGA
[   111.290008] thinkpad_acpi: EC reports that Thermal Table has changed
[   111.290185] nouveau :01:00.0: DRM: resuming kernel object tree..
[   111.477119] nouveau :01:00.0: DRM: resuming client object trees...
[   111.478199] nouveau :01:00.0: DRM: resuming display...
[   111.479289] nouveau :01:00.0: DRM: resuming console...




Thanks,
Francesco


On Sat, Jan 28, 2017 at 02:35:24PM +0200, Francesco Montanari wrote:
> Hi,
>
> With a recent Stretch update, linux 4.9.0-1-amd64 was installed. I got:
>   A TPM error (6) occurred attempting to read a pcr value
>
> Disabling the security chip from BIOS solved the error. Still, 4.9.0
> does not boot. It manages to start some service (e.g., network
> manager), but then it gets stuck before displaying the login screen.
>
> I managed to boot in recovery mode. Here are some errors and warnings
> grepped from the logs:
>
> - From dmesg:
>   [ 0.773426] acpi PNP0A08:00: _OSC failed (AE_SUPPORT); disabling ASPM
>
> - From journalctl -xb:
>   Jan 28 10:05:08 debian-francesco kernel: acpi PNP0A08:00: _OSC failed (AE_SUPPORT); disabling ASPM
>   -- The process /bin/plymouth could not be executed and failed.
>   Jan 28 10:05:17 debian-francesco bluetoothd[512]: Sap driver initialization failed.
>   Jan 28 10:05:08 debian-francesco kernel: EXT4-fs (sda3): re-mounted. Opts: errors=remount-ro
>   -- The error number returned by this process is 2.
>
> - From syslog:
>   https://is.gd/fMGaPr
>
> When rebooting from recovery mode (i.e., executing 'systemctl
> reboot'), it also gets stuck.
>
> Finally, if I run the following command in recovery mode
>   # lspci -nn | grep VGA
> it gets stuck (ctrl-c/d/z have no effect) right after displaying this
> message:
> [   19.948099] nouveau :01:00.0: DRM: suspending console...
> [   19.951782] nouveau :01:00.0: DRM: suspending display...
> [   19.955032] nouveau :01:00.0: DRM: evicting buffers...
> [   19.958215] nouveau :01:00.0: DRM: waiting for kernel channels to go idle...
> [   19.961380] nouveau :01:00.0: DRM: suspending client object trees...
> [   19.964738] nouveau :01:00.0: DRM: suspending kernel object tree...
>
> I successfully booted with the old 4.8.0 kernel (normal mode). In that
> case:
>   $ lspci -nn | grep VGA
>   00:02.0 VGA compatible controller [0300]: Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller [8086:0126] (rev 09)
>   01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GF119M [Quadro NVS 4200M] [10de:1057] (rev a1)
>
>
> Not sure if it is related to some configuration I may have modified in
> the past, to graphics drivers or, e.g., to old BIOS? Maybe something
> else?
>
> I have a Thinkpad T420.
>
> Thanks,
> Francesco
>



Boot fails after stretch upgrade to linux 4.9.0

2017-01-28 Thread Francesco Montanari
Hi,

With a recent Stretch update, linux 4.9.0-1-amd64 was installed. I got:
  A TPM error (6) occurred attempting to read a pcr value

Disabling the security chip from BIOS solved the error. Still, 4.9.0
does not boot. It manages to start some service (e.g., network
manager), but then it gets stuck before displaying the login screen.

I managed to boot in recovery mode. Here are some errors and warnings
grepped from the logs:

- From dmesg:
  [ 0.773426] acpi PNP0A08:00: _OSC failed (AE_SUPPORT); disabling ASPM

- From journalctl -xb:
  Jan 28 10:05:08 debian-francesco kernel: acpi PNP0A08:00: _OSC failed (AE_SUPPORT); disabling ASPM
  -- The process /bin/plymouth could not be executed and failed.
  Jan 28 10:05:17 debian-francesco bluetoothd[512]: Sap driver initialization failed.
  Jan 28 10:05:08 debian-francesco kernel: EXT4-fs (sda3): re-mounted. Opts: errors=remount-ro
  -- The error number returned by this process is 2.

- From syslog:
  https://is.gd/fMGaPr

When rebooting from recovery mode (i.e., executing 'systemctl
reboot'), it also gets stuck.

Finally, if I run the following command in recovery mode
  # lspci -nn | grep VGA
it gets stuck (ctrl-c/d/z have no effect) right after displaying this
message:
[   19.948099] nouveau :01:00.0: DRM: suspending console...
[   19.951782] nouveau :01:00.0: DRM: suspending display...
[   19.955032] nouveau :01:00.0: DRM: evicting buffers...
[   19.958215] nouveau :01:00.0: DRM: waiting for kernel channels to go idle...
[   19.961380] nouveau :01:00.0: DRM: suspending client object trees...
[   19.964738] nouveau :01:00.0: DRM: suspending kernel object tree...

I successfully booted with the old 4.8.0 kernel (normal mode). In that
case:
  $ lspci -nn | grep VGA
  00:02.0 VGA compatible controller [0300]: Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller [8086:0126] (rev 09)
  01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GF119M [Quadro NVS 4200M] [10de:1057] (rev a1)


Not sure if it is related to some configuration I may have modified in
the past, to graphics drivers or, e.g., to old BIOS? Maybe something
else?

I have a Thinkpad T420.

Thanks,
Francesco



Re: Adding condition on package version pinning (example case: linphone)

2016-07-03 Thread Francesco Montanari
Sorry, I just found the solution: specify the problematic version and
give it a negative priority (instead of forcing a downgrade).

Package: linphone*
Version: 3.6.1-2.6
Pin: release n=stretch
Pin-Priority: -10
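
For what it's worth, apt_preferences(5) documents version-specific pins
in the form "Pin: version ..." rather than through a separate
"Version:" field, so a stanza along the following lines may be closer
to the intent. This is a sketch that has not been tested on the setup
above; the helper name write_linphone_pin and the file name under
/etc/apt/preferences.d/ are hypothetical.

```shell
#!/bin/sh
# Sketch: pin only the problematic linphone version to a negative priority.
# write_linphone_pin and the default file name are illustrative and do not
# come from the thread.
write_linphone_pin() {
    pref_file="${1:-/etc/apt/preferences.d/linphone}"
    cat > "$pref_file" <<'EOF'
Package: linphone*
Pin: version 3.6.1-2.6
Pin-Priority: -10
EOF
}

# Typical use, as root (nothing is run automatically here):
#   write_linphone_pin
#   apt-cache policy linphone   # shows the priority of each candidate version
```

With a pin like this, only 3.6.1-2.6 is blocked, so a later version
appearing in stretch would be considered at its normal priority.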


On Sun, Jul 3, 2016 at 12:35 PM, Francesco Montanari <franz1010...@gmail.com> wrote:

> Hi,
>
> I am on Stretch and the default linphone version is unusable for me (I hit
> the reported bug #743494). However Jessie's version works, so I pinned it in
> /etc/apt/preferences:
>
> Package: linphone*
> Pin: release n=jessie
> Pin-Priority: 1001
>
> However, the version is quite old (e.g., it lacks ZRTP support). Is it
> possible to put a condition such that it should prioritize version
> 3.6.1-2.4 (jessie) over 3.6.1-2.6 (current on stretch), but upgrade if >=
> 3.7 becomes available on stretch repositories?
>
> Cheers,
> Francesco
>


Adding condition on package version pinning (example case: linphone)

2016-07-03 Thread Francesco Montanari
Hi,

I am on Stretch and the default linphone version is unusable for me (I hit
the reported bug #743494). However Jessie's version works, so I pinned it in
/etc/apt/preferences:

Package: linphone*
Pin: release n=jessie
Pin-Priority: 1001

However, the version is quite old (e.g., it lacks ZRTP support). Is it
possible to put a condition such that it should prioritize version
3.6.1-2.4 (jessie) over 3.6.1-2.6 (current on stretch), but upgrade if >=
3.7 becomes available on stretch repositories?

Cheers,
Francesco


Re: ThinkPad fan

2016-06-30 Thread Francesco Montanari
On 06/30/2016 10:58 AM, Jörg-Volker Peetz wrote:
> Does your ThinkPad contain hybrid graphic adapters, an NVidia GPU?
>
> Regards,
> jvp.
>
>

Please find the info below (integrated Intel + discrete Nvidia). I
replaced the thermal paste on both the CPU and the discrete graphics
card (just followed the hardware maintenance manual). One thing that I
may not have done properly is the quantity of paste, which the manual
specifies as precisely 0.2g. I proceeded with an educated guess
rather than measuring it...

Anyway, now most of the time the temperature stays below 60C, even when
video calling (before the maintenance it easily reached 80C with Jitsi),
and goes above it (70C-80C) only after a few minutes of 100% CPU usage
(when before it reached 96C). Seems reasonable?

Regards,
Francesco

# lspci -vnn | grep VGA -A 12
00:02.0 VGA compatible controller [0300]: Intel Corporation 2nd
Generation Core Processor Family Integrated Graphics Controller
[8086:0126] (rev 09) (prog-if 00 [VGA controller])
Subsystem: Lenovo Device [17aa:21d0]
Flags: bus master, fast devsel, latency 0, IRQ 45
Memory at f140 (64-bit, non-prefetchable) [size=4M]
Memory at e000 (64-bit, prefetchable) [size=256M]
I/O ports at 6000 [size=64]
Expansion ROM at  [disabled]
Capabilities: [90] MSI: Enable+ Count=1/1 Maskable- 64bit-
Capabilities: [d0] Power Management version 2
Capabilities: [a4] PCI Advanced Features
Kernel driver in use: i915

00:16.0 Communication controller [0780]: Intel Corporation 6 Series/C200
Series Chipset Family MEI Controller #1 [8086:1c3a] (rev 04)
--
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GF119M
[Quadro NVS 4200M] [10de:1057] (rev a1) (prog-if 00 [VGA controller])
Subsystem: Lenovo Device [17aa:21d0]
Flags: bus master, fast devsel, latency 0, IRQ 47
Memory at f000 (32-bit, non-prefetchable) [size=16M]
Memory at c000 (64-bit, prefetchable) [size=256M]
Memory at d000 (64-bit, prefetchable) [size=32M]
I/O ports at 5000 [size=128]
Expansion ROM at f100 [disabled] [size=512K]
Capabilities: [60] Power Management version 3
Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
Capabilities: [78] Express Endpoint, MSI 00
Capabilities: [b4] Vendor Specific Information: Len=14 
Capabilities: [100] Virtual Channel



Re: ThinkPad fan

2016-06-30 Thread Francesco Montanari


On 06/30/2016 03:13 AM, Gener Badenas wrote:
> 
> Is it just hardware issue or do you think the OS play some role on
> heating issue?  Can using a different OS make temp better?
> One more suggestion is to polish the surface well before attaching the
> heatsink.
>  
>

It was mainly a hardware problem. Cleaning the fan and especially
replacing the thermal grease helped to reduce the max temp from 96C to
less than 80C.

Enabling the package 'thinkfan' with its default configuration also
helps: it lowers the temperature by a further ~5C and slows down the
rise in temperature above 60C.

Regards,
Francesco



Re: ThinkPad fan

2016-06-28 Thread Francesco Montanari
Hi,

On Sat, Jun 18, 2016 at 10:07 PM, Henrique de Moraes Holschuh <
h...@debian.org> wrote:
>
> On Sat, 18 Jun 2016, Francesco Montanari wrote:
> > 60C using thinkfan. When I run with CPUs at 100% it reaches 90C in less
> > than two minutes. For work I need to launch short (few minutes) but CPU
>
> Looks like a cracked thermal interface between the heatsink and the CPU.
> The repair is easy, but rather annoying to do: you have to replace the
> thermal compound and reseat the heatsink.
>

I finally found the paste and the time to replace it. Indeed, it now works
nicely: the maximum temperature stays around 60C, even after several
minutes of intense CPU usage.

Thanks for the help!

Francesco


Re: ThinkPad fan

2016-06-18 Thread Francesco Montanari
> > I should probably clean the fans, but another thing I usually do for
> > normal
> > usage is to limit the cpu speed.
> >
> > As root:
> >
> > #echo 60 > /sys/devices/system/cpu/intel_pstate/max_perf_pct
> >
> > Or whatever number you feel good with. Unless I am doing heavy usage,
> > I do not
> > notice performance penalties and temperatures do not get that high.
>
> This sounds like a good suggestion.

Indeed, this sounds like a great solution for me. Limiting CPUs to 80% does
not increase the execution time terribly, and with thinkfan active it stays
below 70C when running the same code that I was testing before (without
thinkfan it reaches 80C). I attach below the scripts that I am using to set
the cpu limit at boot time, enabled with "systemctl enable
set_cpu_max_perf_pct".

Thanks,
Francesco




== /usr/bin/set_cpu_max_perf_pct ==
#!/bin/sh
# Cap or restore the intel_pstate maximum performance percentage.

PCT_FILE=/sys/devices/system/cpu/intel_pstate/max_perf_pct

# Set the max CPU performance percentage
start() {
    echo 80 > "$PCT_FILE"
}

# Revert back to 100 percent
stop() {
    echo 100 > "$PCT_FILE"
}

case "$1" in
    start|stop) "$1" ;;
    *) echo "Usage: $0 {start|stop}" >&2; exit 1 ;;
esac




== /etc/systemd/system/set_cpu_max_perf_pct.service ==
[Unit]
Description=Limit global cpu usage

[Service]
Type=oneshot
ExecStart=/usr/bin/set_cpu_max_perf_pct start
ExecStop=/usr/bin/set_cpu_max_perf_pct stop
RemainAfterExit=yes


[Install]
WantedBy=multi-user.target
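
A sketch of how the two files above might be put into place and
checked. The helper name current_max_perf_pct is hypothetical, and its
optional argument exists only so it can be pointed at another file; the
commands in the comments assume the paths given in the headers above.

```shell
#!/bin/sh
# Sketch: read back the current intel_pstate cap to confirm the service worked.
# current_max_perf_pct is an illustrative helper, not part of the scripts above.
current_max_perf_pct() {
    pct_file="${1:-/sys/devices/system/cpu/intel_pstate/max_perf_pct}"
    if [ -r "$pct_file" ]; then
        cat "$pct_file"
    else
        echo "intel_pstate not available: $pct_file" >&2
        return 1
    fi
}

# Typical use, as root (nothing is run automatically here):
#   chmod +x /usr/bin/set_cpu_max_perf_pct
#   systemctl daemon-reload
#   systemctl enable set_cpu_max_perf_pct.service
#   systemctl start set_cpu_max_perf_pct.service
#   current_max_perf_pct    # should report 80 once the service has started
```

Stopping the service runs the "stop" branch, which writes 100 back, so
the cap can be lifted temporarily for a long computation.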


Re: ThinkPad fan

2016-06-18 Thread Francesco Montanari
Hi,

Thanks for the suggestions. I tried the following, which didn't change the
situation much. Could it be that CPUs just run hotter as they age, or
should that not matter if they are cleaned properly?

a) I disassembled and cleaned the fan. It was fairly dusty (I have had the
laptop for about 5 years); I used a vacuum cleaner to remove the dust. I
have the impression that the fan now pushes out more air.

b) I installed and configured thinkfan (despite the buggy installation
[1]). The package description [2] says it is helpful in the case the fan is
running too much (not really my problem), but it actually provides an easy
way in general to set the fan levels for given temperature ranges [3]. In
comparison to before, the fan runs faster now when above 60C.

c) I also tried to turn on the disengaged mode by hand (~5500rpm instead
of ~4500rpm) [4].

d) I had a look at the script suggested by Tom (thanks), but didn't try it
since I managed to install thinkfan. FYI, I think that
/proc/acpi/ibm/thermal is no longer the way to get the temperature.
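
On the last point, a minimal sketch of reading temperatures from sysfs
instead. It assumes the kernel exposes thermal zones under
/sys/class/thermal, with values in millidegrees Celsius; the function
name print_zone_temps and its optional root argument are hypothetical,
and which zones exist varies by machine.

```shell
#!/bin/sh
# Sketch: print each thermal zone's temperature in whole degrees Celsius.
# print_zone_temps and its optional root argument are illustrative only.
print_zone_temps() {
    root="${1:-/sys/class/thermal}"
    for zone in "$root"/thermal_zone*; do
        # Skip zones without a readable temp file (or an unmatched glob).
        [ -r "$zone/temp" ] || continue
        # Some zones report an error on read; skip those as well.
        millideg=$(cat "$zone/temp" 2>/dev/null) || continue
        echo "$(basename "$zone"): $((millideg / 1000))C"
    done
}

# Typical use (no root needed):
#   print_zone_temps
```

The "sensors" command from lm-sensors reports similar values along with
the fan speed, where the hardware provides them.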

For "normal" use (browsing, video, music, ...) the temperature stays below
60C using thinkfan. When I run with CPUs at 100% it reaches 90C in less
than two minutes. For work I need to launch short (a few minutes) but
CPU-intensive runs (mostly numerical integrations) on a daily basis. I'll
ask around if someone can lend me one of those fan pads to try it out.

Thanks,
Francesco

1. https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=767127
2. https://packages.debian.org/jessie/thinkfan
3. http://forums.debian.net/viewtopic.php?t=118734
4. http://www.thinkwiki.org/wiki/How_to_control_fan_speed

On Fri, Jun 17, 2016 at 7:27 PM, Sven Arvidsson <s...@whiz.se> wrote:

> On Fri, 2016-06-17 at 12:58 +0300, Francesco Montanari wrote:
> > Hi,
> >
> > I recently installed Jessie on a Lenovo ThinkPad T420. The fan usage
> > looks
> > reasonable. However, high temperatures (96 C) are reached when CPUs
> > are
> > running intensively for more than one minute or so. The fan speed at
> > those
> > temperatures is about 4500 rpm.
> >
> > Do you think it is ok, or do you suggest to force lower temperatures,
> > e.g.,
> > with thinkfan [1]?
>
> It isn't unusual with a temperature like that during load, but it
> mostly seems to happen when playing demanding games on a laptop.
>
> I wouldn't worry about it if it just happens momentarily, but long
> running tasks that stress the system like that is probably not suitable
> on a laptop.
>
> Anyway, as others have pointed out, check for dust and make sure that
> you don't impede air flow yourself (by placing it on bed for example).
>
> --
> Cheers,
> Sven Arvidsson
> http://www.whiz.se
>
>


ThinkPad fan

2016-06-17 Thread Francesco Montanari
Hi,

I recently installed Jessie on a Lenovo ThinkPad T420. The fan usage looks
reasonable. However, high temperatures (96 C) are reached when CPUs are
running intensively for more than one minute or so. The fan speed at those
temperatures is about 4500 rpm.

Do you think it is ok, or do you suggest forcing lower temperatures, e.g.,
with thinkfan [1]?

Thanks,
Francesco

[1] https://packages.debian.org/jessie/thinkfan