frankd wrote:
> I am using the script with the two bug-fixes now. I hard-coded the
> gateway just to be sure to reduce the number of potential sources of
> failure, non-hardcoded is of course ways more elegant in future.
> The good news: In most cases the two affected radios did restart the
> wifi without me requiring a reboot. Within one week I had to reboot one
> radio once - it was without network and rebooting was the only way to
> get it back again... Thus overall I have to intervene a lot less.
>
> Since the radios are located furthest away from my office at home, the
> only way to monitor failures for me means synchronizing the entire
> apartment. My observations:
> if the WIFI of one of the affected radios fails, the synchronization
> group is interrupted (maybe 5 second interruption , plays for 5 seconds,
> then is interrupted for another second). My interpretation: First
> interruption after a certain time of wifi failure, the group continues
> without the affected radio and gets interrupted again when the affected
> radio joins the affected group again (using Philippe's Sync-Group
> Plugin).
>
> For me it seems as if the time from detecting the problem first to
> restarting the network takes a long time, causing these interruptions.
> One example with time stamps below. That was the reason why I wanted to
> use only 3 loops with 1 second delay each...
>

It looks like the automatic gateway detection does not have to be overridden, as it seems reliable.
Looking at your logs, what stands out to me first of all is the very low signal level:

...Link Quality:26/94 Signal level:-69 dBm Noise level:-96 dBm Tx excessive retries:34....

I may have this backwards, but "Franky24" may have a higher signal level than "Franky2". That said, I mistakenly ran a radio with similar levels and it continued to work, albeit with more frequent resets.

Also, your "failed" to "reset" time is only 7 seconds vs 19 seconds here, but your "reset" to "up" time is 28 seconds vs 19 seconds here. Perhaps the long "reset" to "up" delay has something to do with the low signal strength.

The script does not track how many times the radio recovers from a few missed pings. I was giving it, and any other lower-level mitigating mechanism, every opportunity to recover before "pulling the rug out" from under the network stack and restarting everything. Since the music generally kept playing, those values seemed OK. Synchronization adds another level of complexity, and the default values may not be optimal for that use case. I suppose we could add a statistic for the number of failed pings and the longest ping failure, for later transmission, to help evaluate the timeout constants.

BTW, there are other ways to monitor failures than synchronization. The TCP logger ncat described in manual.txt seems the easiest, yielding a real-time display and a saved log of failures. I also use the excellent Nirsoft Wireless Network Watcher and NetworkConnectLog apps for an on-screen real-time update (although brief interruptions may not be captured by the latter two).

Thanks for the report.

------------------------------------------------------------------------
POMdev's Profile: http://forums.slimdevices.com/member.php?userid=70558
View this thread: http://forums.slimdevices.com/showthread.php?t=109953
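P.S. To make the ping statistic idea above a bit more concrete, here is a rough sketch of the bookkeeping I have in mind. It is written in Python purely for illustration and is not part of the script; the names PingStats, record, failed_pings and longest_outage_s are made up for this example. It assumes each ping result arrives with a timestamp in seconds, and it measures an outage from the first missed ping until the next successful one.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PingStats:
    """Running statistics over ping results (hypothetical helper, not in the script)."""
    failed_pings: int = 0                  # total pings that got no reply
    longest_outage_s: float = 0.0          # longest outage seen so far, in seconds
    _outage_start: Optional[float] = None  # time of the first missed ping in the current outage

    def record(self, timestamp: float, ok: bool) -> None:
        """Feed one ping result: its time in seconds and whether a reply arrived."""
        if not ok:
            self.failed_pings += 1
            if self._outage_start is None:
                self._outage_start = timestamp
        if self._outage_start is not None:
            # Outage length so far: first missed ping up to this sample.
            outage = timestamp - self._outage_start
            self.longest_outage_s = max(self.longest_outage_s, outage)
        if ok:
            # A reply ends the outage in progress, if any.
            self._outage_start = None


# Example: one ping per second, two short outages.
stats = PingStats()
samples = [(0.0, True), (1.0, False), (2.0, False), (3.0, True), (4.0, False), (5.0, True)]
for t, ok in samples:
    stats.record(t, ok)
print(stats.failed_pings, stats.longest_outage_s)
```

The two numbers could then be sent along with the existing log output, so the loop count and delay can be judged against how long real outages last before a restart is triggered.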
