Hi, Ryan.

On Tue, Mar 7, 2017 at 11:11 PM, Ryan Tandy <[email protected]> wrote:
> Hi Rogério,
>
> Sorry to hear about this issue.
>
> Is there enough info in syslog/journal to tell whether it's a graceful
> (software-initiated) shutdown or simply the hardware forcing power off?

No, there are no messages, which makes things very hard to debug. In
fact, I looked at the code, and the daemon could use more messages
being sent to syslog. :)

> I was going to mention the temperature thresholds that were adjusted:
>
> https://anonscm.debian.org/cgit/collab-maint/micro-evtd.git/commit/?id=b6a052b00cf898689dba1dd993037facaa1bf741
>
> but if that were the issue, I would expect it to cause problems when
> micro-evtd is not running, as well (since in this case nothing will trigger
> the fan to spin up).

Indeed, the limits only went up (if I didn't miss anything), and my
system seems very stable with micro-evtd disabled. Of course, none of
its functionality (like pressing the button to turn the unit off) is
available then (unless I hold the button for a hard shutdown).

> 1-2 days is a very odd time period for it to survive!

Indeed, especially since, without micro-evtd or with the old
version/"old state of the world" (with respect to both the kernel and
earlier versions of micro-evtd), I get a computer that keeps working
as long as I don't reboot it and there are no power outages.

Oh, I noticed something strange, but I don't know if it is related to
micro-evtd not working. For quite some time now (let's say 2 or 3
years, but I don't remember when it started), whenever I try to see
the environment with fw_printenv, I get errors in the kernel log
telling me that the NAND has unrecoverable errors:

- - - - - - - -
(...)
[ 22.466531] Adding 396284k swap on /dev/sda3. Priority:-1 extents:1 across:396284k FS
[ 23.123208] EXT4-fs (sda1): mounting ext2 file system using the ext4 subsystem
[ 23.184435] EXT4-fs (sda1): mounted filesystem without journal. Opts: errors=remount-ro
[ 28.634334] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 31.259076] mv643xx_eth_port mv643xx_eth_port.0 eth0: link up, 1000 Mb/s, full duplex, flow control disabled
[ 31.268998] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[ 56.754422] NFSD: starting 90-second grace period (net c058c578)
[132923.387111] __nand_correct_data: uncorrectable ECC error
[132923.392597] __nand_correct_data: uncorrectable ECC error
[132923.398070] __nand_correct_data: uncorrectable ECC error
[132923.403519] __nand_correct_data: uncorrectable ECC error
[132923.408972] __nand_correct_data: uncorrectable ECC error
[132923.414422] __nand_correct_data: uncorrectable ECC error
[132923.419876] __nand_correct_data: uncorrectable ECC error
[132923.425327] __nand_correct_data: uncorrectable ECC error
[132923.431272] __nand_correct_data: uncorrectable ECC error
[132923.436729] __nand_correct_data: uncorrectable ECC error
[132923.442192] __nand_correct_data: uncorrectable ECC error
[132923.447663] __nand_correct_data: uncorrectable ECC error
[132923.453113] __nand_correct_data: uncorrectable ECC error
[132923.458564] __nand_correct_data: uncorrectable ECC error
[132923.464019] __nand_correct_data: uncorrectable ECC error
[132923.469468] __nand_correct_data: uncorrectable ECC error
- - - - - - - -

Note that I only get these errors if I use fw_printenv (I was going to
disable the restrictions on reading from /dev/mem and revert micro-evtd
to the previous version). I never got these errors with just regular
use of the kurobox, regardless of whether I use the old version of
micro-evtd or the new one, which leads me to think that the situation
above may be unrelated, but I am a layman here.

Thanks,

-- 
Rogério Brito : rbrito@{ime.usp.br,gmail.com} : GPG key 4096R/BCFCAAAA
http://cynic.cc/blog/ : github.com/rbrito : profiles.google.com/rbrito
DebianQA: http://qa.debian.org/developer.php?login=rbrito%40ime.usp.br

