Hi Paul,

I use Python to drive my system. At startup and every hour I walk the 1-wire directory looking for devices; at all other times I just use read(). This is all wrapped in try/except, and exceptions are written to a log file, but that file gets rotated. I have looked in the current logs and there are no exceptions, but obviously there could have been some in the past.
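One way to keep a record that outlives a rotated log is to tally failed reads per device next to the try/except logging. Below is a minimal sketch, not a description of the actual system. It assumes a hypothetical `read_fn(path)` callable that stands in for whatever read() call is already in use and that raises an exception on failure. `ReadTally` and its attribute names are illustrative only. The 45 second default mirrors the "no good read" threshold mentioned further down the thread.

```python
import logging
import time
from collections import Counter

log = logging.getLogger("onewire")


class ReadTally:
    """Wrap a read call and count failures per path (hypothetical helper)."""

    def __init__(self, read_fn, clock=time.monotonic):
        self.read_fn = read_fn      # assumed: raises an exception on a failed read
        self.clock = clock          # injectable so the timing can be tested
        self.failures = Counter()   # path -> number of failed reads so far
        self.last_good = {}         # path -> clock value at the last good read

    def read(self, path):
        """Return the value read, or None if the read raised an exception."""
        try:
            value = self.read_fn(path)
        except Exception as exc:
            self.failures[path] += 1
            log.warning("read failed for %s: %s", path, exc)
            return None
        self.last_good[path] = self.clock()
        return value

    def stale(self, path, max_age=45.0):
        """True if path has had no good read within max_age seconds."""
        seen = self.last_good.get(path)
        return seen is None or self.clock() - seen > max_age
```

Dumping `failures` to a small file once an hour, at the same time as the directory walk, would give a history that does not depend on the log rotation schedule.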
Is there a better way to look for errors? I assume that at a lower level owfs will try and retry on error. Is there an easy way to access this? I have tried looking at /statistics/errors/ but all the values in there seem to stay at zero.

Cheers
Mick

On 29/04/15 12:59, Paul W Panish wrote:
> Mick,
>
> Thanks for the response. The mechanism I've implemented attempts 5 reads
> as quickly as the API responds. As each access fails I log the error
> message returned by the API as a WARNING and retry. After 5 attempts I
> log an ERROR and skip the update. Unfortunately I mistakenly deleted my
> log file so I don't have the exact text, but the failure is a 'file not
> found' indication.
>
> I've increased the owfs update interval to 5 seconds to match my desired
> system update interval. I can't wait indefinitely for a success as my
> queues will start to back up, and eventually overflow. An occasional
> error, or even a short string of errors and skipped updates, isn't a
> problem, but a systematic error isn't acceptable for a couple of
> reasons. First, if I miss an over-temperature indication I won't change
> state to increase circulator speed, or enable heat dumping. Second, and
> this is of more concern, if the bus corruption causes a write to fail I
> may assume I've entered a state (once again circulator speed or heat
> dumping) when I have not. Even if this doesn't occur, in a boiler system
> the heat inputs are high enough that over-temperature and pressure can
> occur very quickly, the result being a blown safety valve with the
> accompanying mess.
>
> Since the low temperature operation has never shown an error or even a
> warning, it's clear the alternative may be to move away from the devices
> causing problems when heated.
>
> What I'm wondering is whether the DS18B20's have an inherent
> vulnerability at high temperatures, and that system implementation has
> to assume a high failure rate under these conditions.
> Even using redundancy this would limit the range of applications where
> you'd want to use these devices.
>
> It would be interesting to know if you're seeing the same type of error
> returns. It probably just means increasing your level of logging so that
> each failed access is indicated. It took me a while to find this since I
> was originally only logging failures after all attempts had failed. As a
> result I saw system failures only after weeks of operation. Once I
> started logging the intermediate warnings it became clear what the
> problem was.
>
> Paul
>
> Mick Sulley wrote:
>> Hi Paul,
>>
>> I assume all references to temperatures in your mail are degrees F; if so
>> I am surprised that it causes a problem. I use 27 DS1820's on my system
>> which includes 7 measuring solar panels, these can and have gone to over
>> 120 degrees C and I have not experienced the problems that you have.
>>
>> How are you detecting the errors? I poll as fast as I can, which is
>> about 15 seconds or so. I log when I get a good read from each device,
>> so error detection is really "no good read for more than 45 seconds". I
>> have a couple of sensors that I suspect are faulty and fail from time to
>> time but the rest are fine and nothing seems to be temperature related.
>>
>> If you have some other way to log errors I would be happy to try to
>> incorporate that into my system to gather more info.
>>
>> Cheers
>> Mick
>>
>> On 29/04/15 01:04, Paul W Panish wrote:
>>> I’m wondering if anyone has information on an issue I’ve been having
>>> with DS18B20 temperature sensors.
>>>
>>> For some time I’ve been developing a wood fired boiler/heating/DHW
>>> system controller
>>> (https://sourceforge.net/projects/bctl/?source=directory) using the
>>> owcapi for all sensing and I/O functionality. My 1-wire network is
>>> limited in length and low in device weight.
>>> I have two DS18B20 temperature sensors and three Hobbyboards DS2408
>>> based PIO boards on a roughly 50 foot linear topology bus using CAT5e
>>> cabling and standard RJ45 connectors for daisy-chaining bus segments and
>>> device attachment. The drops to each device are 1 meter or less. I’m
>>> providing power and ground through the CAT5e cabling.
>>>
>>> My problem is that there seems to be a strong temperature dependency for
>>> bus read/write errors caused by the DS18B20 sensors. I’ve replaced the
>>> sensors a few times with devices purchased at different times and from
>>> different vendors to rule out random bad devices.
>>>
>>> I’m using a polling loop to read the DS18B20’s and PIO inputs at 5
>>> second intervals with a conversion resolution of 10 bits
>>> (temperature10). When the system is cold (<140 degrees F) it can go
>>> forever (months) with no errors indicated in any device access. However,
>>> when I fire the boiler I start seeing access errors (file not found) as
>>> the boiler temperature rises above roughly 150 degrees. The error rate
>>> increases as temperatures rise to a maximum level of about 185 degrees
>>> at which point they are quite severe.
>>>
>>> The errors are not just on access to the temperature sensors (which are
>>> hot), but also on access to the DS2408 devices (which remain at room
>>> temperature), though much less frequently. From this I’m deducing that
>>> bus timing is changing for the temperature sensors in such a manner that
>>> they are corrupting access to other devices. I don’t have a scope so I’m
>>> unable to check for slew rates, noise, or reflection problems, however
>>> none of these should be affected by device heating (well maybe slew rate…)
>>>
>>> I’ve implemented a redundant read mechanism (in addition to any
>>> redundancy owfs implements), which has made the system usable, but over
>>> the long term this is a risky solution.
>>> I can tolerate the read errors assuming I get an occasional success,
>>> however if a write to a PIO output is dropped the results could be messy.
>>>
>>> One solution would be to switch to thermocouple sensors for the high
>>> temperature components, using the MAX31850 devices, which I’ll do in the
>>> absence of any other remedy. However, the temperatures I’m dealing with
>>> are all well within the specified limits of the DS18B20 family of
>>> devices, so I’m wondering if anyone has had similar experience and could
>>> shed some light on the situation.
>>>
>>>
>>> ------------------------------------------------------------------------------
>>> One dashboard for servers and applications across Physical-Virtual-Cloud
>>> Widest out-of-the-box monitoring support with 50+ applications
>>> Performance metrics, stats and reports that give you Actionable Insights
>>> Deep dive visibility with transaction tracing using APM Insight.
>>> http://ad.doubleclick.net/ddm/clk/290420510;117567292;y
>>> _______________________________________________
>>> Owfs-developers mailing list
>>> Owfs-developers@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/owfs-developers
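
The redundant access described in the thread (up to five attempts, each failure logged as a WARNING, an ERROR once every attempt has failed) and the concern about a dropped PIO write can be sketched together as below. This is only an illustration under stated assumptions, not the bctl or owfs implementation. `read_fn(path)` and `write_fn(path, value)` are hypothetical callables standing in for the real API calls and are assumed to raise an exception on failure. The sketch also assumes that reading `path` back returns the value last written, which may not hold for every device property.

```python
import logging

log = logging.getLogger("onewire")


def read_with_retries(read_fn, path, attempts=5):
    """Call read_fn(path) up to `attempts` times; return the value or None."""
    for attempt in range(1, attempts + 1):
        try:
            return read_fn(path)
        except Exception as exc:
            # Log every intermediate failure, not just the final one.
            log.warning("read %s failed (attempt %d/%d): %s",
                        path, attempt, attempts, exc)
    log.error("read %s failed after %d attempts; skipping update", path, attempts)
    return None


def write_verified(write_fn, read_fn, path, value, attempts=5):
    """Write value and read it back; True only if the read-back matches."""
    for attempt in range(1, attempts + 1):
        try:
            write_fn(path, value)
            if read_fn(path) == value:
                return True
            log.warning("write %s not confirmed (attempt %d/%d)",
                        path, attempt, attempts)
        except Exception as exc:
            log.warning("write %s failed (attempt %d/%d): %s",
                        path, attempt, attempts, exc)
    log.error("write %s unconfirmed after %d attempts", path, attempts)
    return False
```

A controller using this would change its internal state (circulator speed, heat dumping) only when `write_verified` returns True, so a dropped write is never mistaken for a completed one.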