Hi Paul,

I use Python to drive my system. At startup and every hour I walk the
1-wire directory looking for devices; at all other times I just use
read(). This is all wrapped in try/except, and exceptions are written to
a log file, but that gets rotated. I have looked in the current logs and
there are no exceptions, but obviously there could have been some in the
past.
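
For illustration, a minimal sketch of a per-device tally that does not
depend on the log file surviving rotation. It is only an assumption about
how this could be done: read_fn is a hypothetical callable standing in for
whatever owfs binding is in use, the ReadStats name and the device paths
are made up, and the counts live in memory only, so they are lost on
restart.

```python
import time
from collections import defaultdict


class ReadStats:
    """Per-device tally of good and failed reads, kept outside the log file."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock              # injectable so the tally can be tested
        self.ok = defaultdict(int)      # path -> number of good reads
        self.failed = defaultdict(int)  # path -> number of failed reads
        self.last_good = {}             # path -> clock value of last good read

    def read(self, read_fn, path):
        """Call read_fn(path), record the outcome, return the value or None."""
        try:
            value = read_fn(path)
        except Exception:
            self.failed[path] += 1
            return None
        self.ok[path] += 1
        self.last_good[path] = self.clock()
        return value

    def stale(self, max_age):
        """Paths tried at least once with no good read in the last max_age seconds."""
        now = self.clock()
        seen = set(self.ok) | set(self.failed)
        return sorted(
            p for p in seen
            if p not in self.last_good or now - self.last_good[p] > max_age
        )
```

Calling stats.stale(45) once per polling pass would list the devices with
no good read in the last 45 seconds, and stats.failed could be dumped
whenever the log is rotated.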

Is there a better way to look for errors? I assume that at a lower
level owfs will retry on error; is there an easy way to access this?
I have tried looking at /statistics/errors/ but all the values in
there seem to stay at zero.
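
Whatever owfs does internally, retries can at least be made visible at the
application level. A minimal sketch of the scheme Paul describes below (up
to 5 attempts, a WARNING for each failed attempt, an ERROR and a skipped
update when all fail). The read_fn callable, the function name and the
"onewire" logger name are assumptions for illustration, not part of any
owfs API.

```python
import logging

log = logging.getLogger("onewire")  # hypothetical logger name


def read_with_retries(read_fn, path, attempts=5):
    """Try read_fn(path) up to `attempts` times, logging every failure."""
    for attempt in range(1, attempts + 1):
        try:
            return read_fn(path)
        except Exception as exc:
            # Each failed attempt is logged, not just the final give-up.
            log.warning("read %s failed (attempt %d/%d): %s",
                        path, attempt, attempts, exc)
    log.error("read %s failed after %d attempts, skipping update",
              path, attempts)
    return None
```

Logging every failed attempt, rather than only the final failure, is what
makes intermittent bus problems show up early.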

Cheers
Mick

On 29/04/15 12:59, Paul W Panish wrote:
> Mick,
>
> Thanks for the response. The mechanism I've implemented attempts 5 reads
> as quickly as the api responds. As each access fails I log the error
> message returned by the api as a WARNING and retry. After 5 attempts I
> log an ERROR and skip the update. Unfortunately I mistakenly deleted my
> log file so I don't have the exact text, but the failure is a 'file not
> found' indication.
>
> I've increased the owfs update interval to 5 seconds to match my desired
> system update interval. I can't wait indefinitely for a success as my
> queues will start to back up, and eventually overflow. An occasional
> error, or even a short string of errors and skipped updates, isn't a
> problem, but a systematic error isn't acceptable for a couple of
> reasons. First, if I miss an over-temperature indication I won't change
> state to increase circulator speed, or enable heat dumping. Second, and
> this is of more concern, if the bus corruption causes a write to fail I
> may assume I've entered a state (once again circulator speed or heat
> dumping) when I have not. Even if this doesn't occur, in a boiler system
> the heat inputs are high enough that over-temperature and pressure can
> occur very quickly, the result being a blown safety valve with the
> accompanying mess.
>
> Since the low temperature operation has never shown an error or even a
> warning, it's clear the alternative may be to move away from the devices
> causing problems when heated.
>
> What I'm wondering is whether the DS18B20s have an inherent
> vulnerability at high temperatures, and whether a system implementation
> has to assume a high failure rate under these conditions. Even with
> redundancy this would limit the range of applications where you'd want
> to use these devices.
>
> It would be interesting to know if you're seeing the same type of error
> returns. It probably just means increasing your level of logging so that
> each failed access is indicated. It took me a while to find this since I
> was originally only logging failures after all attempts had failed. As a
> result I saw system failures only after weeks of operation. Once I
> started logging the intermediate warnings it became clear what the
> problem was.
>
> Paul
>
> Mick Sulley wrote:
>> Hi Paul,
>>
>> I assume all references to temperatures in your mail are in degrees F;
>> if so, I am surprised that it causes a problem.  I use 27 DS1820s on my
>> system, including 7 measuring solar panels. These can and have gone to
>> over 120 degrees C, and I have not experienced the problems that you have.
>>
>> How are you detecting the errors?  I poll as fast as I can, which is
>> about every 15 seconds or so.  I log when I get a good read from each
>> device, so error detection really means no good read for > 45 seconds.  I
>> have a couple of sensors that I suspect are faulty and fail from time to
>> time, but the rest are fine and nothing seems to be temperature related.
>>
>> If you have some other way to log errors I would be happy to try to
>> incorporate that into my system to gather more info.
>>
>> Cheers
>> Mick
>>
>> On 29/04/15 01:04, Paul W Panish wrote:
>>> I’m wondering if anyone has information on an issue I’ve been having
>>> with DS18B20 temperature sensors.
>>>
>>> For some time I’ve been developing a wood fired boiler/heating/DHW
>>> system controller
>>> (https://sourceforge.net/projects/bctl/?source=directory) using the
>>> owcapi for all sensing and I/O functionality. My 1-wire network is
>>> limited in length and low in device weight. I have two DS18B20
>>> temperature sensors and three Hobbyboards DS2408 based PIO boards on a
>>> roughly 50 foot linear topology bus using CAT5e cabling and standard
>>> RJ45 connectors for daisy-chaining bus segments and device attachment.
>>> The drops to each device are 1 meter or less. I’m providing power and
>>> ground through the CAT5e cabling.
>>>
>>> My problem is that there seems to be a strong temperature dependency for
>>> bus read/write errors caused by the DS18B20 sensors. I’ve replaced the
>>> sensors a few times with devices purchased at different times and from
>>> different vendors to rule out random bad devices.
>>>
>>> I’m using a polling loop to read the DS18B20’s and PIO inputs at 5
>>> second intervals with a conversion resolution of 10 bits
>>> (temperature10). When the system is cold (<140 degrees F) it can go
>>> forever (months) with no errors indicated in any device access. However,
>>> when I fire the boiler I start seeing access errors (file not found) as
>>> the boiler temperature rises above roughly 150 degrees. The error rate
>>> increases as temperatures rise to a maximum of about 185 degrees,
>>> at which point the errors are quite severe.
>>>
>>> The errors are not just on access to the temperature sensors (which are
>>> hot), but also on access to the DS2408 devices (which remain at room
>>> temperature), though much less frequently. From this I’m deducing that
>>> bus timing is changing for the temperature sensors in such a manner that
>>> they are corrupting access to other devices. I don’t have a scope so I’m
>>> unable to check for slew rates, noise, or reflection problems; however,
>>> none of these should be affected by device heating (well, maybe slew rate…)
>>>
>>> I’ve implemented a redundant read mechanism (in addition to any
>>> redundancy owfs implements), which has made the system usable, but over
>>> the long term this is a risky solution. I can tolerate the read errors
>>> assuming I get an occasional success, however if a write to a PIO output
>>> is dropped the results could be messy.
>>>
>>> One solution would be to switch to thermocouple sensors for the high
>>> temperature components, using the MAX31850 devices, which I’ll do in the
>>> absence of any other remedy. However, the temperatures I’m dealing with
>>> are all well within the specified limits of the DS18B20 family of
>>> devices, so I’m wondering if anyone has had similar experience and could
>>> shed some light on the situation.
>>>
>>>
>>> ------------------------------------------------------------------------------
>>> One dashboard for servers and applications across Physical-Virtual-Cloud
>>> Widest out-of-the-box monitoring support with 50+ applications
>>> Performance metrics, stats and reports that give you Actionable Insights
>>> Deep dive visibility with transaction tracing using APM Insight.
>>> http://ad.doubleclick.net/ddm/clk/290420510;117567292;y
>>> _______________________________________________
>>> Owfs-developers mailing list
>>> Owfs-developers@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/owfs-developers
>>

