Re: cli64 CPU segfaults

2024-01-30 Thread Adam Weremczuk

Thanks, everyone, for the useful feedback :)

On 29/01/2024 21:05, Gremlin wrote:
> On 1/29/24 14:35, Michael Kjörling wrote:
>> On 29 Jan 2024 19:20 +, from ad...@matrixscience.com (Adam Weremczuk):
>>> I have 2 bare metal Debian 12.4 servers with fairly new Intel CPUs and
>>> plenty of memory.
>>>
>>> On both, dmesg continuously reports:
>>>
>>> (...)
>>> [Mon Jan 29 12:13:00 2024] cli64[1666090]: segfault at 0 ip 0040dd3b
>>> sp 7ffc2bfba630 error 4 in cli64[40+18a000] likely on CPU 41 (core
>>> 17, socket 0)
>>> (...)
>>
>> What's cli64? A package search comes up empty for me.
>>
>> https://packages.debian.org/search?searchon=contents&keywords=cli64&mode=exactfilename&suite=bookworm&arch=any
>
> https://www.advancedclustering.com/act_kb/what-is-cli64/


Re: cli64 CPU segfaults

2024-01-29 Thread Gremlin

On 1/29/24 14:35, Michael Kjörling wrote:
> On 29 Jan 2024 19:20 +, from ad...@matrixscience.com (Adam Weremczuk):
>> I have 2 bare metal Debian 12.4 servers with fairly new Intel CPUs and
>> plenty of memory.
>>
>> On both, dmesg continuously reports:
>>
>> (...)
>> [Mon Jan 29 12:13:00 2024] cli64[1666090]: segfault at 0 ip 0040dd3b
>> sp 7ffc2bfba630 error 4 in cli64[40+18a000] likely on CPU 41 (core
>> 17, socket 0)
>> (...)
>
> What's cli64? A package search comes up empty for me.
>
> https://packages.debian.org/search?searchon=contents&keywords=cli64&mode=exactfilename&suite=bookworm&arch=any

https://www.advancedclustering.com/act_kb/what-is-cli64/





Re: cli64 CPU segfaults

2024-01-29 Thread Michael Kjörling

On 29 Jan 2024 19:20 +, from ad...@matrixscience.com (Adam Weremczuk):
> I have 2 bare metal Debian 12.4 servers with fairly new Intel CPUs and
> plenty of memory.
> 
> On both, dmesg continuously reports:
> 
> (...)
> [Mon Jan 29 12:13:00 2024] cli64[1666090]: segfault at 0 ip 0040dd3b
> sp 7ffc2bfba630 error 4 in cli64[40+18a000] likely on CPU 41 (core
> 17, socket 0)
> (...)

What's cli64? A package search comes up empty for me.

https://packages.debian.org/search?searchon=contents&keywords=cli64&mode=exactfilename&suite=bookworm&arch=any


> $ sudo dmesg -T | grep cli64 | wc -l

Useless use of wc. :-) "grep -c" will show a count of matching lines.
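
A small sketch of the difference, using a few sample lines in place of real dmesg output. The sample text is abbreviated from the report in this thread and the count_matches helper name is made up for illustration:

```shell
# Hypothetical helper: count the lines on stdin that contain a pattern.
# grep -c prints the number of matching lines, so no wc is needed.
count_matches() {
    grep -c -- "$1"
}

# Sample input standing in for `sudo dmesg -T` output (abbreviated lines).
sample='[Mon Jan 29 12:13:00 2024] cli64[1666090]: segfault at 0
[Mon Jan 29 12:13:00 2024] Code: 48 8b 45 c8
[Mon Jan 29 12:19:01 2024] cli64[1667727]: segfault at 0'

# Two of the three sample lines mention cli64, so this prints 2.
printf '%s\n' "$sample" | count_matches cli64
```

On a live system the equivalent one-liner would be `sudo dmesg -T | grep -c cli64`.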

-- 
Michael Kjörling  https://michael.kjorling.se
“Remember when, on the Internet, nobody cared that you were a dog?”



Re: cli64 CPU segfaults

2024-01-29 Thread Michael Stone

On Mon, Jan 29, 2024 at 07:20:14PM +, Adam Weremczuk wrote:
> I have 2 bare metal Debian 12.4 servers with fairly new Intel CPUs and plenty
> of memory.
>
> On both, dmesg continuously reports:
>
> (...)
> [Mon Jan 29 12:13:00 2024] cli64[1666090]: segfault at 0 ip 0040dd3b sp
> 7ffc2bfba630 error 4 in cli64[40+18a000] likely on CPU 41 (core 17,
> socket 0)


Well, what is cli64? I don't think it came with Debian, so you'd have to
start by looking at what that program is doing.

(If you're not sure, I'm going to guess it's the Areca RAID management
software, but it's not a super distinct filename.)




cli64 CPU segfaults

2024-01-29 Thread Adam Weremczuk

Hi all,

I have 2 bare metal Debian 12.4 servers with fairly new Intel CPUs and 
plenty of memory.


On both, dmesg continuously reports:

(...)
[Mon Jan 29 12:13:00 2024] cli64[1666090]: segfault at 0 ip 0040dd3b sp 7ffc2bfba630 error 4 in cli64[40+18a000] likely on CPU 41 (core 17, socket 0)
[Mon Jan 29 12:13:00 2024] Code: 48 8b 45 c8 8b 80 cc 00 00 00 48 8b 55 c8 48 98 0f b6 44 42 4d 0f b6 f0 bf a8 0a 79 00 e8 95 1b 01 00 48 89 45 f0 48 8b 45 f0 <48> 8b 00 48 83 c0 10 48 8b 00 48 8b 7d f0 be b6 0a 41 00 ff d0 8b
[Mon Jan 29 12:19:01 2024] cli64[1667727]: segfault at 0 ip 0040dd3b sp 7ffde94347f0 error 4 in cli64[40+18a000] likely on CPU 16 (core 16, socket 0)
[Mon Jan 29 12:19:01 2024] Code: 48 8b 45 c8 8b 80 cc 00 00 00 48 8b 55 c8 48 98 0f b6 44 42 4d 0f b6 f0 bf a8 0a 79 00 e8 95 1b 01 00 48 89 45 f0 48 8b 45 f0 <48> 8b 00 48 83 c0 10 48 8b 00 48 8b 7d f0 be b6 0a 41 00 ff d0 8b
[Mon Jan 29 12:24:02 2024] cli64[1669594]: segfault at 0 ip 0040dd3b sp 7ffd305bebe0 error 4 in cli64[40+18a000] likely on CPU 40 (core 16, socket 0)
[Mon Jan 29 12:24:02 2024] Code: 48 8b 45 c8 8b 80 cc 00 00 00 48 8b 55 c8 48 98 0f b6 44 42 4d 0f b6 f0 bf a8 0a 79 00 e8 95 1b 01 00 48 89 45 f0 48 8b 45 f0 <48> 8b 00 48 83 c0 10 48 8b 00 48 8b 7d f0 be b6 0a 41 00 ff d0 8b
[Mon Jan 29 12:29:03 2024] cli64[1675152]: segfault at 0 ip 0040dd3b sp 7ffddbe853b0 error 4 in cli64[40+18a000] likely on CPU 43 (core 19, socket 0)
[Mon Jan 29 12:29:03 2024] Code: 48 8b 45 c8 8b 80 cc 00 00 00 48 8b 55 c8 48 98 0f b6 44 42 4d 0f b6 f0 bf a8 0a 79 00 e8 95 1b 01 00 48 89 45 f0 48 8b 45 f0 <48> 8b 00 48 83 c0 10 48 8b 00 48 8b 7d f0 be b6 0a 41 00 ff d0 8b

(...)

$ sudo dmesg -T | grep cli64 | wc -l
1349

Other than that, they seem to be running ok.

I don't see it on similar AMD-powered kit.

Somebody suggested a faulty memory module, or software trying to access
a restricted part of memory. I'm not convinced.
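
For reference, the fields in the segfault lines can be read directly. "segfault at 0" is the faulting address, and on x86 the "error" value is the page-fault error code, a bit mask. Below is a minimal shell sketch of decoding it. The function name decode_pf_error is made up, and it only looks at the three low bits, on the assumption that no higher bits are set (true for "error 4"):

```shell
# Hypothetical helper: decode the low three bits of an x86 page-fault
# error code as printed in "segfault at ADDR ... error N" kernel lines.
#   bit 0: 0 = page not present, 1 = protection violation
#   bit 1: 0 = read access,      1 = write access
#   bit 2: 0 = kernel mode,      1 = user mode
decode_pf_error() {
    code=$1
    if [ $(( code & 1 )) -ne 0 ]; then cause="protection violation"; else cause="page not present"; fi
    if [ $(( code & 2 )) -ne 0 ]; then access="write"; else access="read"; fi
    if [ $(( code & 4 )) -ne 0 ]; then mode="user"; else mode="kernel"; fi
    printf '%s-mode %s, %s\n' "$mode" "$access" "$cause"
}

# The value from the reports above; prints "user-mode read, page not present".
decode_pf_error 4
```

Combined with a fault address of 0, that reads as a user-mode read through a null pointer. The instruction pointer and the Code: bytes are the same in every report (the bytes marked <48> 8b 00 are the faulting instruction, a load through %rax), which usually points at a repeatable bug in the program rather than at random memory corruption.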


Any ideas or hints?

Cheers,
Adam