Triode wrote:
> Yes core dumps are only when the process crashes - but I thought this
> was the problem case? Is the NAS still running but squeezelite no
> longer running? Also can you check the dmesg for any errors and if the
> NAS has rebooted?
One of three squeezelite processes, each on a different RasPi, died or
lost contact with LMS in the last 9 hours.
On the Pi the process still shows up in ps ax:
Code:
--------------------
5596 ? Rsl 442:00 /root/bin/squeezelite -m 00 00 00 00 00 12 -n
Berliner Zimmer -d all debug -f /root/debug.log -o hw:CARD=0,DEV=0
--------------------
top on RasPi gives
Code:
--------------------
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
5596 root 20 0 12240 7124 1312 R 98.6 1.6 443:45.18 squeezelite
--------------------
No core dump, process is still running.
Debug log (only the end, file is 21MB+ right now; can fetch other parts
if needed):
Code:
--------------------
[09:11:29.967503] sendSTAT:134 STAT: STMt
[09:12:16.211407] process:347 strm
[09:12:16.211745] process_strm:202 strm command t
[09:12:16.211905] sendSTAT:134 STAT: STMt
[09:12:16.219030] process:347 strm
[09:12:16.219372] process_strm:202 strm command t
[09:12:16.219528] sendSTAT:134 STAT: STMt
[13:17:19.446704] process:347 strm
[13:17:19.447079] process_strm:202 strm command t
[13:17:19.447284] sendSTAT:134 STAT: STMt
--------------------
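The tail above jumps from 09:12:16 to 13:17:19. To look for similar silences in the full 21MB file, here is a minimal Python sketch. It assumes every log line starts with a bracketed HH:MM:SS.microseconds timestamp as in the excerpt, and that a backwards jump means the log rolled past midnight (the log carries no dates). The function name and the 5 minute threshold are my own choices, not anything from squeezelite.

```python
from datetime import datetime, timedelta

def find_gaps(lines, min_gap=timedelta(minutes=5)):
    """Yield (previous_time, current_time, gap) for consecutive
    timestamped lines that are at least min_gap apart."""
    prev = None
    for line in lines:
        end = line.find("]")
        if not line.startswith("[") or end == -1:
            continue  # not a timestamped log line
        try:
            now = datetime.strptime(line[1:end], "%H:%M:%S.%f")
        except ValueError:
            continue  # bracketed text that is not a timestamp
        if prev is not None:
            gap = now - prev
            if gap < timedelta(0):
                gap += timedelta(days=1)  # assumption: rolled past midnight
            if gap >= min_gap:
                yield prev.time(), now.time(), gap
        prev = now

# The tail of the debug log quoted above.
SAMPLE = [
    "[09:11:29.967503] sendSTAT:134 STAT: STMt",
    "[09:12:16.211407] process:347 strm",
    "[09:12:16.211745] process_strm:202 strm command t",
    "[09:12:16.211905] sendSTAT:134 STAT: STMt",
    "[09:12:16.219030] process:347 strm",
    "[09:12:16.219372] process_strm:202 strm command t",
    "[09:12:16.219528] sendSTAT:134 STAT: STMt",
    "[13:17:19.446704] process:347 strm",
    "[13:17:19.447079] process_strm:202 strm command t",
    "[13:17:19.447284] sendSTAT:134 STAT: STMt",
]

gaps = list(find_gaps(SAMPLE))
for start, stop, gap in gaps:
    print(f"silent from {start} to {stop} ({gap})")
```

For the real file, pass an open file object instead of SAMPLE, e.g. find_gaps(open("/root/debug.log")). What a gap means (no traffic from the server, or the player stuck) is a separate question; this only locates them.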
Uptime of the NAS
Code:
--------------------
DiskStation> uptime
17:24:14 up 3 days, 5:12, load average: 0.14, 0.19, 0.12
--------------------
dmesg (other boot messages cut) of NAS:
Code:
--------------------
[ 23.210000] usbcore: registered new interface driver usbhid
[ 23.210000] usbhid: v2.6:USB HID core driver
[ 25.120000] eth0: no IPv6 routers present
[ 37.760000] loop: module loaded
[ 38.490000] Slow work thread pool: Starting up
[ 38.920000] Slow work thread pool: Ready
[ 47.750000] findhostd uses obsolete (PF_INET,SOCK_PACKET)
[ 63.490000] usbcore: registered new interface driver snd-usb-audio
(final boot message)
[14187.000000] eth0: link down
[14189.930000] eth0: link up, full duplex, speed 1 Gbps
[14199.570000] eth0: link down
[14202.460000] eth0: link up, full duplex, speed 1 Gbps
--------------------
dmesg/uptime of RasPi
Code:
--------------------
[ 13.618724] systemd[1]: Starting Journal Service...
[ 13.633236] systemd[1]: Started Journal Service.
[ 13.633597] systemd[1]: Starting Syslog.
[ 13.639372] systemd[1]: Reached target Syslog.
[ 13.883550] systemd-udevd[48]: starting version 196
[ 14.726458] bcm2708_i2c bcm2708_i2c.0: BSC0 Controller at 0x20205000 (irq
79) (baudrate 100k)
[ 14.726681] bcm2708_i2c bcm2708_i2c.1: BSC1 Controller at 0x20804000 (irq
79) (baudrate 100k)
[ 14.751572] bcm2708_spi bcm2708_spi.0: SPI Controller at 0x20204000 (irq
80)
[ 16.317886] usbcore: registered new interface driver snd-usb-audio
[ 16.476596] systemd-journald[52]: Received SIGUSR1
[ 16.588760] systemd-journald[52]: File
/var/log/journal/5c49cb5bf08724338eb09a6900092bb0/system.journal corrupted or
uncleanly shut down, renaming and replacing.
[ 16.905064] usbcore: registered new interface driver rtl8192cu
[ 18.584094] NET: Registered protocol family 10
[ 18.587981] ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 35.205941] ADDRCONF(NETDEV_UP): wlan0: link is not ready
[ 42.112272] ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
[ 53.161029] wlan0: no IPv6 routers present
[root@BerlinerZimmer ~]# uptime
16:38:48 up 23:42, 1 user, load average: 1.50, 1.46, 1.44
--------------------
Query of LMS from Notebook now gives:
Code:
--------------------
$ (echo "players 0"; sleep 1; echo "exit") | telnet 192.168.1.100 9090
Trying 192.168.1.100...
Connected to 192.168.1.100.
Escape character is '^]'.
players 0 count%3A2 playerindex%3A0 playerid%3A00%3A00%3A00%3A00%3A00%3A14
uuid%3A ip%3A192.168.1.214%3A54500 name%3AWohnzimmer model%3Asqueezelite
isplayer%3A1 displaytype%3Anone canpoweroff%3A1 connected%3A1 playerindex%3A1
playerid%3A90%3Af6%3A52%3A7f%3A24%3A60 uuid%3A ip%3A192.168.1.163%3A39975
name%3ASqueezeLite%203 model%3Asqueezelite isplayer%3A1 displaytype%3Anone
canpoweroff%3A1 connected%3A1
Connection closed by foreign host.
--------------------
Before I left this morning it was:
Code:
--------------------
$ (echo "players 0"; sleep 1; echo "exit") | telnet 192.168.1.100 9090
players 0 count%3A3 playerindex%3A0 playerid%3A00%3A00%3A00%3A00%3A00%3A14
uuid%3A ip%3A192.168.1.214%3A50809 name%3AWohnzimmer model%3Asqueezelite
isplayer%3A1 displaytype%3Anone canpoweroff%3A1 connected%3A1 playerindex%3A1
playerid%3A00%3A00%3A00%3A00%3A00%3A12 uuid%3A ip%3A192.168.1.108%3A44683
name%3ABerliner%20Zimmer model%3Asqueezelite isplayer%3A1 displaytype%3Anone
canpoweroff%3A1 connected%3A1 playerindex%3A2
playerid%3A90%3Af6%3A52%3A7f%3A24%3A60 uuid%3A ip%3A192.168.1.163%3A36114
name%3ASqueezeLite%203 model%3Asqueezelite isplayer%3A1 displaytype%3Anone
canpoweroff%3A1 connected%3A1
Connection closed by foreign host.
--------------------
So LMS seems to have completely removed this squeezelite instance.
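The two CLI responses are hard to compare by eye because they are percent-encoded. A small Python sketch that decodes them and lists which player disappeared follows. The parsing rule (whitespace-separated key:value tokens, with each player block starting at playerindex) is inferred from the output above, and the helper name is hypothetical.

```python
from urllib.parse import unquote

def parse_players(response):
    """Return {playerid: name} from a raw 'players' CLI response."""
    players = []
    for token in response.split():
        key, sep, value = unquote(token).partition(":")
        if not sep or key == "count":
            continue  # command echo ("players", "0") or the count field
        if key == "playerindex":
            players.append({})  # a new player block starts here
        if players:
            players[-1][key] = value
    return {p.get("playerid"): p.get("name") for p in players}

# Responses copied from the two telnet sessions above.
BEFORE = """players 0 count%3A3 playerindex%3A0 playerid%3A00%3A00%3A00%3A00%3A00%3A14
uuid%3A ip%3A192.168.1.214%3A50809 name%3AWohnzimmer model%3Asqueezelite
isplayer%3A1 displaytype%3Anone canpoweroff%3A1 connected%3A1 playerindex%3A1
playerid%3A00%3A00%3A00%3A00%3A00%3A12 uuid%3A ip%3A192.168.1.108%3A44683
name%3ABerliner%20Zimmer model%3Asqueezelite isplayer%3A1 displaytype%3Anone
canpoweroff%3A1 connected%3A1 playerindex%3A2
playerid%3A90%3Af6%3A52%3A7f%3A24%3A60 uuid%3A ip%3A192.168.1.163%3A36114
name%3ASqueezeLite%203 model%3Asqueezelite isplayer%3A1 displaytype%3Anone
canpoweroff%3A1 connected%3A1"""

NOW = """players 0 count%3A2 playerindex%3A0 playerid%3A00%3A00%3A00%3A00%3A00%3A14
uuid%3A ip%3A192.168.1.214%3A54500 name%3AWohnzimmer model%3Asqueezelite
isplayer%3A1 displaytype%3Anone canpoweroff%3A1 connected%3A1 playerindex%3A1
playerid%3A90%3Af6%3A52%3A7f%3A24%3A60 uuid%3A ip%3A192.168.1.163%3A39975
name%3ASqueezeLite%203 model%3Asqueezelite isplayer%3A1 displaytype%3Anone
canpoweroff%3A1 connected%3A1"""

before = parse_players(BEFORE)
now = parse_players(NOW)
missing = {pid: name for pid, name in before.items() if pid not in now}
print(missing)
```

Run against the two responses, this leaves only 00:00:00:00:00:12 (Berliner Zimmer) in missing, which matches the process that is spinning on the Pi.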
Thoughts:
- In dmesg on the NAS, two eth0 disconnects show up. I don't know when
they occurred, so I can't rule out that it was today. But two
squeezelites survived this; only one got caught in some kind of race
condition (constant 90-100% CPU usage, as shown above).
- This scenario is not what I usually get. This time I have a squeezelite
in some kind of race condition (100% CPU), but most of the time the
lost process (lost from the server's point of view) sits at 1-4% CPU.
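
On the timing of the eth0 link flaps: a rough estimate is possible from the numbers already posted, assuming the dmesg timestamps are seconds since boot, the dmesg output belongs to the same boot as the uptime reading, and the NAS did not suspend in between. uptime only reports minutes, so this is approximate. A minimal Python sketch (the function name is my own):

```python
from datetime import timedelta

def event_age(uptime, dmesg_seconds):
    """How long before the uptime reading a dmesg event was logged,
    assuming dmesg timestamps count seconds since boot."""
    return uptime - timedelta(seconds=dmesg_seconds)

# From "17:24:14 up 3 days, 5:12" on the DiskStation.
nas_uptime = timedelta(days=3, hours=5, minutes=12)

# Timestamps of the eth0 link down/up lines in the NAS dmesg.
link_events = [14187.0, 14189.93, 14199.57, 14202.46]

ages = [event_age(nas_uptime, s) for s in link_events]
for stamp, age in zip(link_events, ages):
    print(f"[{stamp:12.6f}] logged about {age} before the uptime reading")
```

If those assumptions hold, all four lines were logged a little over three days before the uptime reading, roughly four hours after the NAS booted. That would put them before the Pi's own boot (up 23:42), so they would not explain what happened today.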
------------------------------------------------------------------------
Raspi+MIPS's Profile: http://forums.slimdevices.com/member.php?userid=58448
View this thread: http://forums.slimdevices.com/showthread.php?t=97046