Bug#1003653: Revision of removal of rename.ul from package util-linux

2022-03-10 Thread Dirk Kostrewa

Dear Sean,

On 09.03.22 16:54, Sean Whitton wrote:

Dear Dirk,

On Wed 09 Mar 2022 at 12:59pm +01, Dirk Kostrewa wrote:


Personally, I would still prefer a "rename" entry in the alternative
system with util-linux's rename as default, since util-linux is
installed in every Debian system. I know, the syntaxes of util-linux's
rename and of Perl's rename are incompatible, but a user who wants to
use Perl's rename would probably know its syntax, would have to actively
install its package, and would then choose Perl's rename in the
alternative system.

Right, yes, but unfortunately this is off the table for reasons of
historical compatibility.

okay, I see.



If an entry in the alternative system is not wanted, for me, it would
also be fine to have access to util-linux's rename in any PATH with any
recognizable name. I would then create a soft-link, say,
/usr/local/bin/rename, that points to util-linux's rename.

Hmm, if you are planning to create a symlink, then wouldn't both (A) and
(B) be okay with you?


yes, for me, both would be okay, with a preference for (A), because 
util-linux's other binaries are also in PATH.


Best regards,

Dirk.



Bug#1003653: Revision of removal of rename.ul from package util-linux

2022-03-09 Thread Dirk Kostrewa

Dear Sean,

first of all: many thanks to the technical committee for taking care of 
my request! This was my first request, and I am really impressed by the 
way this was discussed and handled!


Personally, I would still prefer a "rename" entry in the alternative 
system with util-linux's rename as default, since util-linux is 
installed in every Debian system. I know, the syntaxes of util-linux's 
rename and of Perl's rename are incompatible, but a user who wants to 
use Perl's rename would probably know its syntax, would have to actively 
install its package, and would then choose Perl's rename in the 
alternative system.
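To make the incompatibility concrete, here is a toy shell sketch (sed
stands in for both tools; this is not how either program is actually
invoked) contrasting the two styles: util-linux's rename substitutes a
literal expression once per filename, while Perl's rename applies a
Perl s/// expression:

```shell
# Toy contrast of the two incompatible call styles (sed as stand-in):
#   util-linux:  rename foo bar foo_foo.txt       (literal, first match)
#   Perl:        rename 's/foo/bar/g' foo_foo.txt (regex expression)
name="foo_foo.txt"
ul=$(printf '%s\n' "$name" | sed 's/foo/bar/')    # first occurrence only
pl=$(printf '%s\n' "$name" | sed 's/foo/bar/g')   # global regex
echo "util-linux style -> $ul"
echo "perl style       -> $pl"
```

The same command line therefore means different things to the two
programs, which is why a shared alternatives slot is contentious.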


If an entry in the alternative system is not wanted, for me, it would 
also be fine to have access to util-linux's rename in any PATH with any 
recognizable name. I would then create a soft-link, say, 
/usr/local/bin/rename, that points to util-linux's rename.
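For illustration, such a symlink could be created roughly as follows. This
is a sketch only: it assumes rename.ul is somewhere on PATH, and the
/usr/bin/rename.ul fallback path is a hypothetical, not a settled location.

```shell
# Sketch: expose util-linux's rename under DEST/rename via a symlink.
link_rename() {
    dest="$1"
    # Fall back to a hypothetical path if rename.ul is not on PATH.
    target="$(command -v rename.ul || echo /usr/bin/rename.ul)"
    mkdir -p "$dest"
    ln -sf "$target" "$dest/rename"
}

# Real-world use (needs root): link_rename /usr/local/bin
# Demonstration against a scratch directory instead:
demo="$(mktemp -d)"
link_rename "$demo"
ls -l "$demo/rename"
```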


Best regards,

Dirk.

On 08.03.22 20:58, Sean Whitton wrote:

Dear Chris, Dirk,

On Tue 08 Feb 2022 at 09:23pm +01, Helmut Grohne wrote:


We've discussed a number of possible ways to put it back (various
packages, various paths, with or without update-alternatives, with or
without Conflicts). From what you said, I understand that: [...]

Given these, we think that much of the issue can be resolved
cooperatively. To get there we (as ctte) ask you to explain how you
prefer adding the util-linux rename executable as precisely as you see
it at this stage. [...]

The ctte discussed this bug at our meeting today and determined that
there are two resolutions to this bug supported by at least one member:

(A) src:util-linux should build a binary package that ships util-linux's
 rename as "rename.ul" somewhere on PATH.

(B) src:util-linux should build a binary package that ships util-linux's
 rename, but does not install it as "rename" anywhere on PATH.
 It is not settled, at present, whether util-linux's rename should be
 provided somewhere on PATH with a name other than "rename".

Option (A) is meant to be (B) plus the additional requirement that it be
rename.ul somewhere on PATH.  Neither option says anything about whether
util-linux's rename.ul should be installed in an Essential package.

Chris, we haven't heard back from you in response to our request for
input quoted above.  We would still very much like to hear what you
think of (A) and (B) and whether you prefer some (C).  If we don't hear
back from you by the time of our next committee meeting in a month, we
will consider voting on (A) and (B).

Dirk, we would be grateful if you would comment on these two
resolutions, but we aren't going to block resolving this bug on hearing
from you.

Thanks both.


--

***
Dirk Kostrewa
Gene Center Munich
Department of Biochemistry, AG Hopfner
Ludwig-Maximilians-Universität München
Feodor-Lynen-Str. 25
D-81377 Munich
Germany
Phone:  +49-89-2180-76845
Fax:+49-89-2180-76998
E-mail:kostr...@genzentrum.lmu.de
dirk.kostr...@lmu.de
WWW:www.genzentrum.lmu.de
***



Bug#1003653: Revision of removal of rename.ul from package util-linux

2022-01-25 Thread Dirk Kostrewa

On 25/01/2022 09:16, Chris Hofstaedtler wrote:

Hi,

* Sean Whitton  [220125 00:06]:

On Mon 24 Jan 2022 at 11:33AM +01, Chris Hofstaedtler wrote:


For context, the idea is that /usr/bin/rename should become
src:util-linux's rename implementation.

That seems likely to break a great many scripts, though?

Perhaps we should ship them both under a name other than
/usr/bin/rename, such that people are prompted to update their scripts
to choose one, or create their own symlink?

Then all of this is a completely pointless exercise. Either we break
them, or it is favorable to keeping the way things are:

A very valid way of closing this discussion is saying "our
(Perl) /usr/bin/rename is great, people should use that".

Chris

Both rename programs have been around for a long time and have their use 
cases, and apparently there are users who rely on one or the other.


Say, the bsdutils package provides "rename.ul", and the perl rename 
package provides "rename.pl". Debian's alternatives system could then 
make each of them available as "/usr/bin/rename". If both get installed, 
the user could be prompted to choose a default "rename".
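A hedged sketch of what that registration could look like with
update-alternatives is below. The priorities, file names, and locations
are assumptions for illustration, not actual Debian packaging; the
scratch directories (--altdir/--admindir) exist only so the sketch can
run without touching the real system databases.

```shell
# Hypothetical alternatives setup for a "rename" link group.
tmp="$(mktemp -d)"
mkdir -p "$tmp/bin" "$tmp/alt" "$tmp/admin"
: > "$tmp/bin/rename.ul"     # stand-in for util-linux's rename
: > "$tmp/bin/rename.pl"     # stand-in for Perl's rename

ua() {
    update-alternatives --quiet --altdir "$tmp/alt" \
        --admindir "$tmp/admin" --log /dev/null "$@"
}

# Register both implementations; the priorities (40/60) are made up.
ua --install "$tmp/bin/rename" rename "$tmp/bin/rename.ul" 40
ua --install "$tmp/bin/rename" rename "$tmp/bin/rename.pl" 60

# In auto mode the highest priority wins; a user would run
# "update-alternatives --config rename" to choose interactively.
readlink -f "$tmp/bin/rename"
```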


Would this apparently simple solution really create any problems?

Dirk.



Bug#982944: rename.ul was arbitrarily removed from util-linux citing non-existent policy

2022-01-24 Thread Dirk Kostrewa
Meanwhile, I have asked the technical committee for a revision of the 
removal of rename.ul from the util-linux package in Debian bug report 
#1003653 (https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1003653).


Dirk.



Bug#1003653: Revision of removal of rename.ul from package util-linux

2022-01-13 Thread Dirk Kostrewa

Package: tech-ctte
Severity: normal

Dear Technical Committee,

the program rename.ul is a bulk file renaming program with a versatile 
and simple syntax. It is part of the public software util-linux on 
kernel.org (https://www.kernel.org/pub/linux/utils/util-linux/) and has 
probably been present in every Linux distribution, including Debian, for 
at least 14 years up to "Buster".


A user requested in Debian bug report #926637 
(https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=926637) to include 
rename.ul in Debian's alternatives system. The package maintainer replied:


"The util-linux rename command does not implement the same (command line)
interface as the alternative(s) does, so it is not policy compliant to
add it as an alternative."

As a result, the maintainer completely removed rename.ul from the 
package util-linux without providing any further reference to this 
Debian policy.


Another Debian bug report #966468 
(https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=966468) to provide 
rename.ul in util-linux again was set to "WONTFIX" without giving any 
further reason.


In Debian bug report #982944 
(https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=982944), users 
complained about the arbitrary removal of rename.ul from util-linux and 
argued:


- no Debian policy could be found for the removal of rename.ul from 
util-linux based on interface differences in the alternatives system


- interface differences are part of the alternatives system, and program 
examples were given


- removal of rename.ul broke workflows existing for decades, including a 
scientific workflow at a university


To summarize: the program rename.ul of the util-linux package has been 
around for decades in Debian and probably every other Linux 
distribution. The maintainer removed this program referring to a Debian 
policy which seems not to exist. Furthermore, differences in program 
interfaces in Debian's alternative system can be found in other program 
examples and are no reason to remove such a well-established program. 
Thus, the removal of rename.ul from the util-linux package appears to be 
both unnecessary and arbitrary.


Since the maintainer did not respond to any of the user arguments in the 
above bug reports, I kindly request the technical committee to revise 
the removal of rename.ul from the package util-linux, hoping that this 
removal will be reversed.


Kind regards,

Dirk Kostrewa.


Bug#982944: rename.ul was arbitrarily removed from util-linux citing non-existent policy

2021-11-12 Thread Dirk Kostrewa
After changing the Linux distribution from CentOS to Debian "Bullseye" 
at an institute of the University of Munich, the removal of rename.ul 
from util-linux in Debian "Bullseye" broke one of our scripts used in a 
scientific workflow in research. I can't believe that this highly 
versatile and simple-to-use tool which is part of the public software at 
kernel.org (https://www.kernel.org/pub/linux/utils/util-linux/) was 
arbitrarily removed by a single person's decision.


One of the strengths of Linux is the freedom of software choice. Please, 
do not cut this freedom without very good reason!


Could you please revert this decision and make rename.ul available in 
util-linux, again?


If not, is there a way to escalate this issue in Debian's package 
decision hierarchy?


Regards,

Dirk.



Bug#966703: linux-image-4.19.0-10-amd64: kworker process with permanent high CPU load

2020-09-04 Thread Dirk Kostrewa

Hi Salvatore,

meanwhile, Dell has replaced the mainboard of my laptop, and after that, 
both the USB over-current kernel messages and the kworker processes with 
high CPU load are gone.


Many thanks for caring about my bug report!

Best regards,

Dirk.

On 29.08.20 at 11:26, Salvatore Bonaccorso wrote:

Hi Dirk,

Thanks for testing that.

On Sat, Aug 29, 2020 at 11:04:43AM +0200, Dirk Kostrewa wrote:

Hi Salvatore,

I have enabled the verbose debugging mode on the command line and have
appended the first 5000 lines of the dmesg output to this e-mail, running
the current kernel from the Buster backports with the two kworker processes
with high CPU load present.

After that, I have applied your patch to this kernel and rebooted with the
patched kernel:

5.7.0-0.bpo.2-amd64 #1 SMP Debian 5.7.10-1~bpo10+1a~test (2020-08-28) x86_64
GNU/Linux

With your patch applied, the two kworker processes with high CPU load
completely disappeared!

Unfortunately, I suspect this indicates either a HW fault or a HW
design error, as stated in the kernel thread we found, which was just
uncovered by the mentioned kernel fix (which we temporarily reverted
with the patch). I can try to ask Alan Stern.

There might be a workaround workable for you: the issue should
disappear if you prevent the system from automatically trying to
suspend the usb2 root hub (you have the same on the usb1 root hub).

# echo on >/sys/bus/usb/devices/usb2/power/control

will do that for the usb2 root hub.

Regards,
Salvatore
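The quoted workaround can be wrapped into a small sketch covering both
root hubs mentioned above. Writing to the real sysfs files requires
root, so the demonstration at the end targets a scratch file instead;
the sysfs paths are the standard power/control locations.

```shell
# keep_on FILE: disable runtime autosuspend by writing "on" to a
# power/control file, if it is writable.
keep_on() {
    [ -w "$1" ] && printf 'on\n' > "$1"
}

# Real-world use (as root), for both root hubs mentioned in the thread:
#   for hub in usb1 usb2; do
#       keep_on "/sys/bus/usb/devices/$hub/power/control"
#   done

# Demonstration against a scratch file:
f="$(mktemp)"
keep_on "$f"
cat "$f"
```

Note this setting does not persist across reboots; it would have to be
reapplied at boot.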




Bug#966703: linux-image-4.19.0-10-amd64: kworker process with permanent high CPU load

2020-08-29 Thread Dirk Kostrewa

Hi Salvatore,

I have enabled the verbose debugging mode on the command line and have 
appended the first 5000 lines of the dmesg output to this e-mail, 
running the current kernel from the Buster backports with the two 
kworker processes with high CPU load present.


After that, I have applied your patch to this kernel and rebooted with 
the patched kernel:


5.7.0-0.bpo.2-amd64 #1 SMP Debian 5.7.10-1~bpo10+1a~test (2020-08-28) 
x86_64 GNU/Linux


With your patch applied, the two kworker processes with high CPU load 
completely disappeared!


A snapshot of the "top" command shows the following top 10 processes:

$ top
top - 10:54:43 up 5 min,  3 users,  load average: 0.18, 0.26, 0.13
Tasks: 225 total,   1 running, 224 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.1 us,  0.1 sy,  0.0 ni, 99.8 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st

MiB Mem :  15928.9 total,  14186.5 free,    900.8 used,    841.7 buff/cache
MiB Swap:  0.0 total,  0.0 free,  0.0 used. 14711.4 avail Mem

  PID USER   PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
  344 root  -51   0       0      0      0 S   0.3  0.0   0:00.10 irq/134-iwlwifi
  425 root   20   0       0      0      0 I   0.3  0.0   0:00.09 kworker/5:3-events
 1216 rtkit  21   1  152652   2856   2616 S   0.3  0.0   0:00.01 rtkit-daemon
 1272 dirk   20   0   52376  17908   5460 S   0.3  0.1   0:00.02 hp-systray
    1 root   20   0  169784  10436   7844 S   0.0  0.1   0:01.64 systemd
    2 root   20   0       0      0      0 S   0.0  0.0   0:00.00 kthreadd
    3 root    0 -20       0      0      0 I   0.0  0.0   0:00.00 rcu_gp
    4 root    0 -20       0      0      0 I   0.0  0.0   0:00.00 rcu_par_gp
    5 root   20   0       0      0      0 I   0.0  0.0   0:00.04 kworker/0:0-events_+
    6 root    0 -20       0      0      0 I   0.0  0.0   0:00.00 kworker/0:0H-kblockd

...

Many thanks for looking after this issue and having found a fix for this!

Best regards,

Dirk


On 28.08.20 at 16:33, Salvatore Bonaccorso wrote:

hi Dirk,

On Wed, Aug 12, 2020 at 05:53:57PM +0200, Dirk Kostrewa wrote:

Hi Salvatore,

I just found out that if neither of the two USB ports is connected, there
are two kworker processes with permanently high CPU load; if one USB port
is connected and the other is not, there is one such kworker process; and
if both USB ports are connected, there is no kworker process with high CPU load.
I think this supports your suspicion that these kworker processes are
connected with the overcurrent condition for both USB ports that I also see
in the dmesg output.
What puzzles me is that I've observed these oddly behaving kworker
processes also with the 5.6 kernel that I've tried from the Buster Backports
repository.

The kernel parameter variant did not work correctly, as there is no
dynamic debug output afaics (the double quotes seem to be placed in the
wrong place); please just try the setting at runtime instead:

# echo 'file drivers/usb/* +p;' > /sys/kernel/debug/dynamic_debug/control

What I meant is (and this is confirmed if you see the issue as well
with the more recent kernels) that the specified commit actually
uncovers an issue possibly present in the HW.

Similarly to you, someone else, in a known case with faulty HW,
reported the following issue upstream:

https://lore.kernel.org/lkml/20200720083956.ga4...@dhcp22.suse.cz/

I would like to see if we can collect as much information as possible
and possibly crosscheck with upstream.

If you build the kernel with the attached patch (that is, with the
commit which is suspected to uncover the issue), does the issue then go
away?

You can follow the guide in
https://kernel-team.pages.debian.net/kernel-handbook/ch-common-tasks.html#s4.2.2
for "simple patching and building" and quickly checking a patch.

Regards,
Salvatore


dmesg.txt.gz
Description: application/gzip


Bug#966703: linux-image-4.19.0-10-amd64: kworker process with permanent high CPU load

2020-08-21 Thread Dirk Kostrewa

Hi Salvatore,

I just want to inform you that I've installed the recent kernel from the 
Buster backports, 5.7.0-0.bpo.2-amd64 #1 SMP Debian 5.7.10-1~bpo10+1 
(2020-07-30) x86_64 GNU/Linux, and I'm still seeing the two kworker 
processes with high CPU load, probably related to the two USB ports with 
over-current condition.


Regards,

Dirk.

On 12.08.20 at 18:05, Salvatore Bonaccorso wrote:

Hi,

Just commenting on the following:

On Wed, Aug 12, 2020 at 05:53:57PM +0200, Dirk Kostrewa wrote:
[...]

What puzzles me, is that I've observed these oddly behaving kworker
processes also with the 5.6 kernel that I've tried from the Buster Backports
repository.

The mentioned commit, is included in the following upstream versions
(relevant for Debian): v4.19.119 (so in buster), v5.6.8 (and so the
buster-backports kernel), v5.7-rc3.

Regards,
Salvatore




Bug#966703: linux-image-4.19.0-10-amd64: kworker process with permanent high CPU load

2020-08-13 Thread Dirk Kostrewa

Hi Salvatore,

I have kernel "linux-image-4.19.0-10-amd64/stable,now 4.19.132-1 amd64" 
installed, so it should already include the mentioned commit, if I 
understand correctly (I'm a bit confused by the two different version 
numbers used by Debian). I have also tried the most recent kernel 
"linux-image-5.6.0-0.bpo.2-amd64/buster-backports 5.6.14-2~bpo10+1 
amd64". For both kernels, I see the two kworker processes with high CPU 
load.


Regards,

Dirk.

On 12.08.20 at 18:05, Salvatore Bonaccorso wrote:

Hi,

Just commenting on the following:

On Wed, Aug 12, 2020 at 05:53:57PM +0200, Dirk Kostrewa wrote:
[...]

What puzzles me, is that I've observed these oddly behaving kworker
processes also with the 5.6 kernel that I've tried from the Buster Backports
repository.

The mentioned commit, is included in the following upstream versions
(relevant for Debian): v4.19.119 (so in buster), v5.6.8 (and so the
buster-backports kernel), v5.7-rc3.

Regards,
Salvatore




Bug#966703: linux-image-4.19.0-10-amd64: kworker process with permanent high CPU load

2020-08-12 Thread Dirk Kostrewa

Hi Salvatore,

I just found out that if neither of the two USB ports is connected, 
there are two kworker processes with permanently high CPU load; if one 
USB port is connected and the other is not, there is one such kworker 
process; and if both USB ports are connected, there is no kworker 
process with high CPU load.
I think this supports your suspicion that these kworker processes are 
connected with the overcurrent condition for both USB ports that I also 
see in the dmesg output.
What puzzles me is that I've observed these oddly behaving kworker 
processes also with the 5.6 kernel that I've tried from the Buster 
Backports repository.


Cheers,

Dirk.

On 12.08.20 at 13:02, Dirk Kostrewa wrote:

Hi Salvatore,

yesterday, I installed the kernel 5.6.0 from the Buster Backports and 
saw again a kworker process with high CPU load.
Oddly, this morning, my laptop didn't boot, so I decided to do a fresh 
install of Debian Buster 10.5.0 (image with non-free firmware because 
of my wifi card) and installed only thunderbird and vim. There is 
still one kworker process with permanently high CPU load.


I gave the dyndbg command that you told me as a kernel parameter upon 
booting and have appended the dmesg output as file dmesg.txt.gz.


Cheers,

Dirk.

On 11.08.20 at 21:21, Salvatore Bonaccorso wrote:

Hi Dirk,

On Tue, Aug 11, 2020 at 12:58:15PM +0200, Dirk Kostrewa wrote:

Hi Salvatore,

as an additional control, I have completely uninstalled the nvidia 
graphics

driver and repeated the kworker observations using the nouveau graphics
driver with the kernel 4.19.0-10-amd64. This time, there are even two
kworker processes constantly running with high CPU load:

$ top
top - 12:37:20 up 10 min,  4 users,  load average: 2.79, 2.54, 1.56
Tasks: 197 total,   3 running, 194 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.0 us, 24.2 sy,  0.0 ni, 74.2 id,  0.0 wa,  0.0 hi,  1.6 si,  0.0 st
MiB Mem :  15889.4 total,  13964.7 free,    626.8 used,   1297.9 buff/cache
MiB Swap:      0.0 total,      0.0 free,      0.0 used.  14849.1 avail Mem


   PID USER  PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
   164 root  20   0       0      0      0 R  80.0  0.0   8:41.67 kworker/6:2+pm
   455 root  20   0       0      0      0 R  80.0  0.0   8:28.23 kworker/2:2+pm
    22 root  20   0       0      0      0 S  20.0  0.0   2:14.82 ksoftirqd/2
    42 root  20   0       0      0      0 S  20.0  0.0   2:08.67 ksoftirqd/6
     1 root  20   0  169644  10212   7796 S   0.0  0.1   0:01.52 systemd
     2 root  20   0       0      0      0 S   0.0  0.0   0:00.00 kthreadd
     3 root   0 -20       0      0      0 I   0.0  0.0   0:00.00 rcu_gp
     4 root   0 -20       0      0      0 I   0.0  0.0   0:00.00 rcu_par_gp
     6 root   0 -20       0      0      0 I   0.0  0.0   0:00.00 kworker/0:0H-kblockd
     7 root  20   0       0      0      0 I   0.0  0.0   0:00.05 kworker/u16:0-event+

The stacks of the two kworker processes show the same output:

[<0>] 0x

I have appended the top 5000 lines tracing as a compressed ascii file
out-cut.txt,gz and the dmesg output as compressed ascii file 
dmesg.txt.gz.


I hope this helps to find out where the problem with the high CPU load
of the kworker processes comes from.

Thanks this is very helpful.

I suspect what you are seeing is an issue with the usb hubport present
before but now uncovered due to the upstream change e9fb08d617bf
("xhci: prevent bus suspend if a roothub port detected a over-current
condition")[1], which was as well backported to v4.19.y in 4.19.119.

Can you add some dynamic debugging on 'drivers/usb/'[2], ideally at
boot time? At runtime it is

# echo 'file drivers/usb/* +p;' > 
/sys/kernel/debug/dynamic_debug/control


or as a kernel parameter to enable the debug messages at boot time
already:

dyndbg="file drivers/usb/* +p;"

Can you attach the dmesg with the enabled debugging?

Regards,
Salvatore

  [1] 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e9fb08d617bfae5471d902112667d0eeb9dee3c4
  [2] 
https://www.kernel.org/doc/html/latest/admin-guide/dynamic-debug-howto.html




Bug#966703: linux-image-4.19.0-10-amd64: kworker process with permanent high CPU load

2020-08-12 Thread Dirk Kostrewa

Hi Salvatore,

yesterday, I installed the kernel 5.6.0 from the Buster Backports and 
saw again a kworker process with high CPU load.
Oddly, this morning, my laptop didn't boot, so I decided to do a fresh 
install of Debian Buster 10.5.0 (image with non-free firmware because of 
my wifi card) and installed only thunderbird and vim. There is still one 
kworker process with permanently high CPU load.


I gave the dyndbg command that you told me as a kernel parameter upon 
booting and have appended the dmesg output as file dmesg.txt.gz.


Cheers,

Dirk.

On 11.08.20 at 21:21, Salvatore Bonaccorso wrote:

Hi Dirk,

On Tue, Aug 11, 2020 at 12:58:15PM +0200, Dirk Kostrewa wrote:

Hi Salvatore,

as an additional control, I have completely uninstalled the nvidia graphics
driver and repeated the kworker observations using the nouveau graphics
driver with the kernel 4.19.0-10-amd64. This time, there are even two
kworker processes constantly running with high CPU load:

$ top
top - 12:37:20 up 10 min,  4 users,  load average: 2.79, 2.54, 1.56
Tasks: 197 total,   3 running, 194 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.0 us, 24.2 sy,  0.0 ni, 74.2 id,  0.0 wa,  0.0 hi,  1.6 si,  0.0 st
MiB Mem :  15889.4 total,  13964.7 free,    626.8 used, 1297.9 buff/cache
MiB Swap:  0.0 total,  0.0 free,  0.0 used. 14849.1 avail Mem

   PID USER  PR  NI    VIRT    RES    SHR S  %CPU %MEM TIME+ COMMAND
   164 root  20   0       0      0      0 R  80.0  0.0   8:41.67 kworker/6:2+pm
   455 root  20   0       0      0      0 R  80.0  0.0   8:28.23 kworker/2:2+pm
    22 root  20   0       0      0      0 S  20.0  0.0   2:14.82 ksoftirqd/2
    42 root  20   0       0      0      0 S  20.0  0.0   2:08.67 ksoftirqd/6
     1 root  20   0  169644  10212   7796 S   0.0  0.1   0:01.52 systemd
     2 root  20   0       0      0      0 S   0.0  0.0   0:00.00 kthreadd
     3 root   0 -20       0      0      0 I   0.0  0.0   0:00.00 rcu_gp
     4 root   0 -20       0      0      0 I   0.0  0.0   0:00.00 rcu_par_gp
     6 root   0 -20       0      0      0 I   0.0  0.0   0:00.00 kworker/0:0H-kblockd
     7 root  20   0       0      0      0 I   0.0  0.0   0:00.05 kworker/u16:0-event+

The stacks of the two kworker processes show the same output:

[<0>] 0x

I have appended the top 5000 lines tracing as a compressed ascii file
out-cut.txt,gz and the dmesg output as compressed ascii file dmesg.txt.gz.

I hope this helps to find out where the problem with the high CPU load of
the kworker processes comes from.

Thanks this is very helpful.

I suspect what you are seeing is an issue with the usb hubport present
before but now uncovered due to the upstream change e9fb08d617bf
("xhci: prevent bus suspend if a roothub port detected a over-current
condition")[1], which was as well backported to v4.19.y in 4.19.119.

Can you add some dynamic debugging on 'drivers/usb/'[2], ideally at
boot time? At runtime it is

# echo 'file drivers/usb/* +p;' > /sys/kernel/debug/dynamic_debug/control

or as a kernel parameter to enable the debug messages at boot time
already:

dyndbg="file drivers/usb/* +p;"

Can you attach the dmesg with the enabled debugging?

Regards,
Salvatore

  [1] 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e9fb08d617bfae5471d902112667d0eeb9dee3c4
  [2] 
https://www.kernel.org/doc/html/latest/admin-guide/dynamic-debug-howto.html


dmesg.txt.gz
Description: application/gzip


Bug#966703: linux-image-4.19.0-10-amd64: kworker process with permanent high CPU load

2020-08-11 Thread Dirk Kostrewa

Hi Salvatore,

as an additional control, I have completely uninstalled the nvidia 
graphics driver and repeated the kworker observations using the nouveau 
graphics driver with the kernel 4.19.0-10-amd64. This time, there are 
even two kworker processes constantly running with high CPU load:


$ top
top - 12:37:20 up 10 min,  4 users,  load average: 2.79, 2.54, 1.56
Tasks: 197 total,   3 running, 194 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.0 us, 24.2 sy,  0.0 ni, 74.2 id,  0.0 wa,  0.0 hi,  1.6 si,  0.0 st

MiB Mem :  15889.4 total,  13964.7 free,    626.8 used, 1297.9 buff/cache
MiB Swap:  0.0 total,  0.0 free,  0.0 used. 14849.1 avail Mem

  PID USER  PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
  164 root  20   0       0      0      0 R  80.0  0.0   8:41.67 kworker/6:2+pm
  455 root  20   0       0      0      0 R  80.0  0.0   8:28.23 kworker/2:2+pm
   22 root  20   0       0      0      0 S  20.0  0.0   2:14.82 ksoftirqd/2
   42 root  20   0       0      0      0 S  20.0  0.0   2:08.67 ksoftirqd/6
    1 root  20   0  169644  10212   7796 S   0.0  0.1   0:01.52 systemd
    2 root  20   0       0      0      0 S   0.0  0.0   0:00.00 kthreadd
    3 root   0 -20       0      0      0 I   0.0  0.0   0:00.00 rcu_gp
    4 root   0 -20       0      0      0 I   0.0  0.0   0:00.00 rcu_par_gp
    6 root   0 -20       0      0      0 I   0.0  0.0   0:00.00 kworker/0:0H-kblockd
    7 root  20   0       0      0      0 I   0.0  0.0   0:00.05 kworker/u16:0-event+


The stacks of the two kworker processes show the same output:

[<0>] 0x

I have appended the top 5000 lines tracing as a compressed ascii file 
out-cut.txt,gz and the dmesg output as compressed ascii file dmesg.txt.gz.


I hope this helps to find out where the problem with the high CPU load 
of the kworker processes comes from.


Cheers,

Dirk.

On 02.08.20 at 18:22, Salvatore Bonaccorso wrote:

Hi Dirk,

On Sun, Aug 02, 2020 at 03:44:09PM +0200, Salvatore Bonaccorso wrote:

Control: tags -1 + moreinfo

Hi Dirk

On Sun, Aug 02, 2020 at 10:00:27AM +0200, Dirk Kostrewa wrote:

Package: src:linux
Version: 4.19.132-1
Severity: normal

Dear Maintainer,

after booting the kernel 4.19.0-10-amd64, there is a kworker process running
with a permanent high CPU load of almost 90% as reported by the "top"
command:

$ top
top - 09:48:19 up 0 min,  4 users,  load average: 1.91, 0.58, 0.20
Tasks: 218 total,   2 running, 216 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.8 us, 12.4 sy,  0.0 ni, 84.5 id,  0.0 wa,  0.0 hi,  2.3 si,  0.0 st
MiB Mem :  15889.4 total,  14173.1 free,    889.3 used,    827.0 buff/cache
MiB Swap:      0.0 total,      0.0 free,      0.0 used.  14677.7 avail Mem

   PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM TIME+ COMMAND
    64 root      20   0       0      0      0 R  86.7   0.0   0:47.41 kworker/0:2+pm
     9 root      20   0       0      0      0 S  20.0   0.0   0:08.84 ksoftirqd/0
   364 root     -51   0       0      0      0 S   6.7   0.0   0:00.50 irq/126-nvidia
  1177 dirk      20   0 2921696 122848  94268 S   6.7   0.8 0:02.23 kwin_x11
     1 root      20   0  169652  10280   7740 S   0.0   0.1 0:01.56 systemd
     2 root      20   0       0      0      0 S   0.0   0.0 0:00.00 kthreadd
...

The expected result after booting the kernel 4.19.0-10-amd64 is a kworker
process with a CPU load close to 0%.

As a control, booting the previous kernel 4.19.0-9-amd64 does not show a
high CPU load for the kworker process. Instead, the kworker CPU load
reported by the "top" command is 0.0%.

Therefore, I suspect a bug in the kernel 4.19.0-10-amd64.

Neither "dmesg" nor "journalctl -b" show any messages containing "kworker".

I am using Debian/GNU Linux 10.5 with kernel 4.19.0-10-amd64 and libc6:amd64
2.28-10.

If you need more information, I would be happy to provide it.

To find out what could be the cause, could you have a look at
https://www.kernel.org/doc/html/latest/core-api/workqueue.html#debugging?
This could help in isolating why the kworker goes crazy.

Please also do one additional thing in addition to the above: can you
reproduce the issue when the kernel does not get tainted, i.e. without
loading the proprietary, out-of-tree modules?

This is particularly important if the issue can be tracked down, found
in upstream and needs to be reported upstream.

Regards,
Salvatore


dmesg.txt.gz
Description: application/gzip


out-cut.txt.gz
Description: application/gzip


Bug#966703: linux-image-4.19.0-10-amd64: kworker process with permanent high CPU load

2020-08-02 Thread Dirk Kostrewa

Hi Salvatore,

I have removed the xorg.conf with the Nvidia graphics driver and any 
nvidia-related *.conf files in /etc/modprobe.d/, and I have rebooted the 
laptop. The following output should show, that only the default nouveau 
driver is loaded:


# lsmod | grep nvidia

# lsmod | grep nouveau
nouveau  2179072  0
ttm   131072  1 nouveau
i2c_algo_bit   16384  2 i915,nouveau
drm_kms_helper    208896  2 i915,nouveau
mxm_wmi    16384  1 nouveau
drm   495616  12 drm_kms_helper,i915,ttm,nouveau
wmi                    28672  6 dell_wmi,wmi_bmof,dell_smbios,dell_wmi_descriptor,mxm_wmi,nouveau
video  45056  4 dell_wmi,dell_laptop,i915,nouveau
button 16384  1 nouveau

# lspci -k | egrep 'VGA|3D' -A2
00:02.0 VGA compatible controller: Intel Corporation HD Graphics 530 (rev 06)

    Subsystem: Dell HD Graphics 530
    Kernel driver in use: i915
--
01:00.0 3D controller: NVIDIA Corporation GM107GLM [Quadro M1000M] (rev a2)
    Subsystem: Dell GM107GLM [Quadro M1000M]
    Kernel driver in use: nouveau

# dmesg | grep -i nvidia
[    4.282530] nouveau :01:00.0: NVIDIA GM107 (117310a2)
[    4.547712] audit: type=1400 audit(1596389563.639:8): 
apparmor="STATUS" operation="profile_load" profile="unconfined" 
name="nvidia_modprobe" pid=543 comm="apparmor_parser"
[    4.547714] audit: type=1400 audit(1596389563.639:9): 
apparmor="STATUS" operation="profile_load" profile="unconfined" 
name="nvidia_modprobe//kmod" pid=543 comm="apparmor_parser"

[    5.944911] nvidia: loading out-of-tree module taints kernel.
[    5.944918] nvidia: module license 'NVIDIA' taints kernel.
[    5.949482] nvidia: module verification failed: signature and/or 
required key missing - tainting kernel
[    5.962949] nvidia-nvlink: Nvlink Core is being initialized, major 
device number 241
[    5.963181] NVRM: The NVIDIA probe routine was not called for 1 
device(s).

   NVRM: nouveau, rivafb, nvidiafb or rivatv
   NVRM: was loaded and obtained ownership of the NVIDIA 
device(s).

   NVRM: driver(s)), then try loading the NVIDIA kernel module
[    5.963182] NVRM: No NVIDIA graphics adapter probed!
[    6.005267] nvidia-nvlink: Unregistered the Nvlink Core, major device 
number 241
[    6.075128] nvidia-nvlink: Nvlink Core is being initialized, major 
device number 241
[    6.075448] NVRM: The NVIDIA probe routine was not called for 1 
device(s).

   NVRM: nouveau, rivafb, nvidiafb or rivatv
   NVRM: was loaded and obtained ownership of the NVIDIA 
device(s).

   NVRM: driver(s)), then try loading the NVIDIA kernel module
[    6.075449] NVRM: No NVIDIA graphics adapter probed!
[    6.097310] nvidia-nvlink: Unregistered the Nvlink Core, major device 
number 241


Apparently, the nvidia driver was loaded first, and after that, the 
nouveau driver took over.


Here is the "top" result, again with a permanent high CPU load for a 
kworker process:


# top
top - 19:50:57 up 18 min,  4 users,  load average: 1,26, 1,22, 0,93
Tasks: 198 total,   2 running, 196 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0,0 us, 11,3 sy,  0,0 ni, 87,1 id,  0,0 wa,  0,0 hi, 1,6 si,  
0,0 st

MiB Mem :  15889,5 total,  13903,9 free,    808,5 used,   1177,0 buff/cache
MiB Swap:  0,0 total,  0,0 free,  0,0 used.  14617,1 avail Mem

  PID USER  PR  NI    VIRT    RES    SHR S  %CPU  %MEM TIME+ COMMAND
   72 root  20   0   0  0  0 R  86,7   0,0 15:23.97 
kworker/7:1+pm
   47 root  20   0   0  0  0 S  13,3   0,0 2:52.21 
ksoftirqd/7

  684 root  20   0  505356 126896 102732 S   6,7   0,8 0:20.77 Xorg
    1 root  20   0  169624  10312   7880 S   0,0   0,1 0:01.34 systemd
    2 root  20   0   0  0  0 S   0,0   0,0 0:00.00 
kthreadd


Here is the stack of PID 72:

# cat /proc/72/stack
[<0>] 0xffffffffffffffff

The file with a few seconds tracing, cut after line 5000 and compressed, 
is attached as "out-no-nvidia.txt.gz".


Please let me know whether my way of not loading the nvidia driver was 
sufficient. If a really untainted system requires completely uninstalling 
the Nvidia driver, I will do it, but I would need more time for this.
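
For readers wondering how to keep the proprietary driver modules from 
loading at all: one common Debian approach (a hedged sketch, not 
necessarily what was done here; the file name is an assumption, any 
*.conf under /etc/modprobe.d/ is read) is a modprobe blacklist:

```conf
# /etc/modprobe.d/blacklist-nvidia.conf (hypothetical file name)
# Keep the proprietary nvidia modules from being auto-loaded.
blacklist nvidia
blacklist nvidia_drm
blacklist nvidia_modeset
```

After creating the file, running "update-initramfs -u" as root rebuilds 
the initramfs so the blacklist also takes effect during early boot.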


Regards,

Dirk.

On 02.08.20 at 18:22, Salvatore Bonaccorso wrote:


Hi Dirk,

On Sun, Aug 02, 2020 at 03:44:09PM +0200, Salvatore Bonaccorso wrote:

Control: tags -1 + moreinfo

Hi Dirk

On Sun, Aug 02, 2020 at 10:00:27AM +0200, Dirk Kostrewa wrote:

Package: src:linux
Version: 4.19.132-1
Severity: normal

Dear Maintainer,

after booting the kernel 4.19.0-10-amd64, there is a kworker process running
with a permanent high CPU load of almost 90% as reported by the "top"
command:

$ top
top - 09:48:19 up 0 min,  4 users,  load average: 1.91, 0.58, 0.20

Bug#966703: linux-image-4.19.0-10-amd64: kworker process with permanent high CPU load

2020-08-02 Thread Dirk Kostrewa

Hi Salvatore,

thank you for taking care of this!

I first did the tracing for a few seconds and have appended the 
compressed output "out.txt.gz", cut after line 5000, to this e-mail. 
Since some "nvidia"-related entries also appear, I should mention, just 
in case, that this is an Optimus laptop: the Nvidia GPU renders the 
images and the integrated Intel GPU sends them to the monitor.


I also tried the stack trace, but was not sure whether I did it right, 
so this is what I did:


# top

top - 16:29:42 up 7 min,  3 users, load average: 1,82, 1,52, 0,80
Tasks: 200 total,   2 running, 198 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0,5 us, 12,4 sy,  0,0 ni, 86,5 id,  0,0 wa, 0,0 hi,  0,5 si,  
0,0 st

MiB Mem :  15889,4 total,  13390,9 free,   1263,3 used, 1235,2 buff/cache
MiB Swap:  0,0 total,  0,0 free,  0,0 used. 14265,6 avail Mem

  PID USER  PR  NI    VIRT    RES    SHR S  %CPU %MEM TIME+ 
COMMAND
   70 root  20   0   0  0  0 R  84,0 0,0   6:23.39 
kworker/4:1+pm
   32 root  20   0   0  0  0 S  16,0 0,0   1:12.32 
ksoftirqd/4

  761 root  20   0  349132 104820  67088 S   3,7 0,6   0:21.44 Xorg
...

I saw the kworker process with PID 70 and thus looked at the stack of 
this process:


# cat /proc/70/stack
[<0>] usb_start_wait_urb+0x65/0x160 [usbcore]
[<0>] usb_control_msg+0xdd/0x140 [usbcore]
[<0>] set_port_feature+0x30/0x40 [usbcore]
[<0>] hub_suspend+0x1e3/0x250 [usbcore]
[<0>] usb_suspend_both+0x9d/0x230 [usbcore]
[<0>] usb_runtime_suspend+0x2a/0x70 [usbcore]
[<0>] __rpm_callback+0xc7/0x200
[<0>] rpm_callback+0x1f/0x70
[<0>] rpm_suspend+0x138/0x670
[<0>] __pm_runtime_suspend+0x41/0x80
[<0>] usb_runtime_idle+0x2d/0x40 [usbcore]
[<0>] __rpm_callback+0xc7/0x200
[<0>] rpm_idle+0xa5/0x310
[<0>] pm_runtime_work+0x73/0x90
[<0>] process_one_work+0x1a7/0x3a0
[<0>] worker_thread+0x30/0x390
[<0>] kthread+0x112/0x130
[<0>] ret_from_fork+0x35/0x40
[<0>] 0xffffffffffffffff

I hope this was right. If I can give you any more information, please 
let me know.
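
The procedure above (spot the busy kworker in "top", then read its 
kernel stack) can be sketched as a small script. This is only an 
illustration, not something from the original report:

```shell
#!/bin/sh
# Sketch: find the kworker with the highest CPU usage and dump its
# kernel stack. Reading /proc/<pid>/stack requires root.
pid=$(ps -eo pid,comm --sort=-%cpu | awk '$2 ~ /^kworker/ {print $1; exit}')
if [ -n "$pid" ]; then
    echo "busiest kworker: pid $pid"
    # A repeatedly changing stack here points at the work item that
    # keeps the worker busy.
    cat "/proc/$pid/stack" 2>/dev/null || echo "(need root to read /proc/$pid/stack)"
else
    echo "no kworker process found"
fi
```

Running "cat /proc/<pid>/stack" a few times in a row gives a crude 
sampling profile of what the worker is doing.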


Regards,

Dirk.

On 02.08.20 at 15:44, Salvatore Bonaccorso wrote:

Control: tags -1 + moreinfo

Hi Dirk

On Sun, Aug 02, 2020 at 10:00:27AM +0200, Dirk Kostrewa wrote:

Package: src:linux
Version: 4.19.132-1
Severity: normal

Dear Maintainer,

after booting the kernel 4.19.0-10-amd64, there is a kworker process running
with a permanent high CPU load of almost 90% as reported by the "top"
command:

$ top
top - 09:48:19 up 0 min,  4 users,  load average: 1.91, 0.58, 0.20
Tasks: 218 total,   2 running, 216 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.8 us, 12.4 sy,  0.0 ni, 84.5 id,  0.0 wa,  0.0 hi,  2.3 si,  0.0
st
MiB Mem :  15889.4 total,  14173.1 free,    889.3 used,    827.0 buff/cache
MiB Swap:      0.0 total,      0.0 free,      0.0 used.  14677.7 avail Mem

   PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM TIME+ COMMAND
    64 root      20   0       0      0      0 R  86.7   0.0 0:47.41
kworker/0:2+pm
     9 root      20   0       0      0      0 S  20.0   0.0 0:08.84
ksoftirqd/0
   364 root     -51   0       0      0      0 S   6.7   0.0 0:00.50
irq/126-nvidia
  1177 dirk      20   0 2921696 122848  94268 S   6.7   0.8 0:02.23 kwin_x11
     1 root      20   0  169652  10280   7740 S   0.0   0.1 0:01.56 systemd
     2 root      20   0       0      0      0 S   0.0   0.0 0:00.00 kthreadd
...

The expected result after booting the kernel 4.19.0-10-amd64 is a kworker
process with a CPU load close to 0%.

As a control, booting the previous kernel 4.19.0-9-amd64 does not show a
high CPU load for the kworker process. Instead, the kworker CPU load
reported by the "top" command is 0.0%.

Therefore, I suspect a bug in the kernel 4.19.0-10-amd64.

Neither "dmesg" nor "journalctl -b" show any messages containing "kworker".

I am using Debian/GNU Linux 10.5 with kernel 4.19.0-10-amd64 and libc6:amd64
2.28-10.

If you need more information, I would be happy to provide it.

To find out what could be the cause, could you have a look at
https://www.kernel.org/doc/html/latest/core-api/workqueue.html#debugging?
This could help in isolating why the kworker goes crazy.
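
The debugging steps from that document can be sketched roughly as 
follows (assuming root and debugfs mounted at /sys/kernel/debug; the 
summary pipeline at the end is an addition, and its awk split on 
"function=" assumes the usual one-line-per-event trace format):

```shell
#!/bin/sh
# Sketch of the workqueue debugging recipe: trace which work items are
# being queued, then see which work function dominates.
TRACE=/sys/kernel/debug/tracing
if [ -w "$TRACE/set_event" ]; then
    # Enable the tracepoint that fires whenever a work item is queued,
    # then capture a few seconds of trace output.
    echo workqueue:workqueue_queue_work > "$TRACE/set_event"
    timeout 10 cat "$TRACE/trace_pipe" > out.txt
fi
# Count how often each work function was queued, most frequent first.
if [ -f out.txt ]; then
    awk -F'function=' 'NF > 1 { split($2, a, " "); print a[1] }' out.txt \
        | sort | uniq -c | sort -rn | head
fi
```

If the capture works, the function flooding the queue should top the 
list; given the "kworker/...+pm" name seen above, one would expect a 
pm-related work function here.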

Regards,
Salvatore


out.txt.gz
Description: application/gzip


Bug#966703: linux-image-4.19.0-10-amd64: kworker process with permanent high CPU load

2020-08-02 Thread Dirk Kostrewa

Package: src:linux
Version: 4.19.132-1
Severity: normal

Dear Maintainer,

after booting the kernel 4.19.0-10-amd64, there is a kworker process 
running with a permanent high CPU load of almost 90% as reported by the 
"top" command:


$ top
top - 09:48:19 up 0 min,  4 users,  load average: 1.91, 0.58, 0.20
Tasks: 218 total,   2 running, 216 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.8 us, 12.4 sy,  0.0 ni, 84.5 id,  0.0 wa,  0.0 hi,  2.3 si, 
 0.0 st

MiB Mem :  15889.4 total,  14173.1 free,    889.3 used,    827.0 buff/cache
MiB Swap:      0.0 total,      0.0 free,      0.0 used.  14677.7 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM TIME+ COMMAND
   64 root      20   0       0      0      0 R  86.7   0.0 0:47.41 
kworker/0:2+pm
    9 root      20   0       0      0      0 S  20.0   0.0 0:08.84 
ksoftirqd/0
  364 root     -51   0       0      0      0 S   6.7   0.0 0:00.50 
irq/126-nvidia
 1177 dirk      20   0 2921696 122848  94268 S   6.7   0.8 0:02.23 
kwin_x11

    1 root      20   0  169652  10280   7740 S   0.0   0.1 0:01.56 systemd
    2 root      20   0       0      0      0 S   0.0   0.0 0:00.00 
kthreadd

...

The expected result after booting the kernel 4.19.0-10-amd64 is a 
kworker process with a CPU load close to 0%.


As a control, booting the previous kernel 4.19.0-9-amd64 does not show a 
high CPU load for the kworker process. Instead, the kworker CPU load 
reported by the "top" command is 0.0%.


Therefore, I suspect a bug in the kernel 4.19.0-10-amd64.

Neither "dmesg" nor "journalctl -b" show any messages containing "kworker".

I am using Debian/GNU Linux 10.5 with kernel 4.19.0-10-amd64 and 
libc6:amd64 2.28-10.


If you need more information, I would be happy to provide it.

Cheers,

Dirk.

-- Package-specific info:
** Version:
Linux version 4.19.0-10-amd64 (debian-ker...@lists.debian.org) 
(gcc version 8.3.0 (Debian 8.3.0-6)) #1 SMP Debian 4.19.132-1 (2020-07-24)


** Command line:
BOOT_IMAGE=/boot/vmlinuz-4.19.0-10-amd64 
root=UUID=7eb1c27f-5474-41cb-a4fc-de2944149287 ro quiet


** Tainted: PWOE (12801)
 * Proprietary module has been loaded.
 * Taint on warning.
 * Out-of-tree module has been loaded.
 * Unsigned module has been loaded.

** Kernel log:
Unable to read kernel log; any relevant messages should be attached

** Model information
sys_vendor: Dell Inc.
product_name: Precision 5510
product_version:
chassis_vendor: Dell Inc.
chassis_version:
bios_vendor: Dell Inc.
bios_version: 1.13.1
board_vendor: Dell Inc.
board_name: 0N8J4R
board_version: A00

** Loaded modules:
rfcomm
ctr
ccm
cmac
bnep
snd_hda_codec_hdmi
arc4
intel_rapl
dell_rbtn
iwlmvm
nls_ascii
nls_cp437
vfat
fat
snd_hda_codec_realtek
x86_pkg_temp_thermal
fuse
intel_powerclamp
mac80211
snd_hda_codec_generic
coretemp
mei_wdt
btusb
btrtl
btbcm
kvm_intel
btintel
dell_laptop
dell_wmi
bluetooth
kvm
iwlwifi
dell_smbios
snd_hda_intel
irqbypass
dcdbas
crct10dif_pclmul
crc32_pclmul
snd_hda_codec
sg
dell_smm_hwmon
hid_multitouch
joydev
wmi_bmof
dell_wmi_descriptor
ghash_clmulni_intel
drbg
snd_hda_core
serio_raw
cfg80211
ansi_cprng
intel_cstate
snd_hwdep
efi_pstore
snd_pcm
snd_timer
ecdh_generic
intel_uncore
snd
rtsx_pci_ms
mei_me
nvidia_drm(POE)
iTCO_wdt
memstick
intel_rapl_perf
rfkill
efivars
pcspkr
soundcore
idma64
pcc_cpufreq
iTCO_vendor_support
mei
intel_pch_thermal
nvidia_modeset(POE)
processor_thermal_device
tpm_tis
intel_soc_dts_iosf
tpm_tis_core
tpm
battery
rng_core
int3403_thermal
intel_hid
dell_smo8800
evdev
int3400_thermal
int3402_thermal
acpi_thermal_rel
sparse_keymap
int340x_thermal_zone
acpi_pad
ac
nvidia(POE)
ipmi_devintf
ipmi_msghandler
parport_pc
ppdev
lp
parport
efivarfs
ip_tables
x_tables
autofs4
ext4
crc16
mbcache
jbd2
crc32c_generic
fscrypto
ecb
usbhid
hid_generic
sd_mod
i915
crc32c_intel
i2c_designware_platform
i2c_designware_core
rtsx_pci_sdmmc
xhci_pci
i2c_algo_bit
mmc_core
xhci_hcd
drm_kms_helper
ahci
libahci
libata
aesni_intel
drm
mxm_wmi
aes_x86_64
psmouse
usbcore
i2c_i801
crypto_simd
scsi_mod
cryptd
glue_helper
i2c_hid
rtsx_pci
intel_lpss_pci
hid
intel_lpss
mfd_core
usb_common
thermal
fan
video
wmi
button

** PCI devices:
00:00.0 Host bridge [0600]: Intel Corporation Skylake Host Bridge/DRAM 
Registers [8086:1910] (rev 07)
Subsystem: Dell Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor Host 
Bridge/DRAM Registers [1028:06e5]
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- 
<MAbort- >SERR- <PERR- INTx-
Latency: 0
Capabilities: <access denied>
Kernel driver in use: skl_uncore

00:01.0 PCI bridge [0604]: Intel Corporation Skylake PCIe Controller 
(x16) [8086:1901] (rev 07) (prog-if 00 [Normal decode])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- 
<MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 16
Bus: primary=00, secondary=01,