Mick,

I don't know anything about the Python API, or how it returns errors. I'm
pretty sure owfs performs retries under the hood, but I assume these are
invisible to the user. I'm not doing anything more than checking for error
returns from OW_get() in the C API. It sounds like you're already checking
this.
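
A minimal Python sketch of the retry scheme described in the quoted message further down (up to five attempts, a WARNING for each failed attempt, an ERROR and a skipped update after the last). The `read_fn` callable and the device path in the usage note are hypothetical stand-ins for whatever read call the application uses, such as a wrapper around OW_get(); this is an illustration under those assumptions, not the actual controller code.

```python
import logging
from typing import Callable, Optional

log = logging.getLogger("onewire")


def read_with_retries(
    read_fn: Callable[[str], str], path: str, attempts: int = 5
) -> Optional[str]:
    """Call read_fn(path) up to `attempts` times.

    read_fn is a hypothetical reader that raises OSError (for example
    FileNotFoundError) when the bus access fails. Returns the value read,
    or None if every attempt failed so the caller can skip this update.
    """
    for attempt in range(1, attempts + 1):
        try:
            return read_fn(path)
        except OSError as exc:
            # Log every failed attempt, not just the final failure, so
            # intermittent bus errors show up in the log.
            log.warning(
                "read %s failed (attempt %d/%d): %s", path, attempt, attempts, exc
            )
    log.error("read %s failed after %d attempts; skipping update", path, attempts)
    return None
```

A caller would pass its own reader, for example `read_with_retries(my_read, "/28.000000000000/temperature10")`, and treat a `None` result as a skipped update. Logging each failed attempt is the design point here: it is what exposes an error pattern that only-log-on-final-failure would hide.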

Based on Jan's feedback and what I'm seeing from others, it doesn't look like 
this is a common problem. I'm not sure what I'm going to do to try and track it 
down further. 

Paul W Panish
Mobile: (603) 343-8901

> On Apr 29, 2015, at 18:34, Mick Sulley <m...@sulley.info> wrote:
> 
> Hi Paul,
> 
> I use Python to drive my system. At startup and every hour I walk the
> 1-wire directory looking for devices; at all other times I just use read().
> This is all in a try: except: block, and exceptions are written to a log
> file, but that gets rotated.  I have looked in the current logs and there
> are no exceptions, but obviously there could have been some in the past.
> 
> Is there a better way to look for errors?  I assume that at a lower 
> level owfs will try and retry on error, is there an easy way to access 
> this?  I have tried looking at /statistics/errors/ but all the values in 
> there seem to stay at zero.
> 
> Cheers
> Mick
> 
>> On 29/04/15 12:59, Paul W Panish wrote:
>> Mick,
>> 
>> Thanks for the response. The mechanism I've implemented attempts 5 reads
>> as quickly as the API responds. As each access fails I log the error
>> message returned by the API as a WARNING and retry. After 5 attempts I
>> log an ERROR and skip the update. Unfortunately I mistakenly deleted my
>> log file so I don't have the exact text, but the failure is a 'file not
>> found' indication.
>> 
>> I've increased the owfs update interval to 5 seconds to match my desired
>> system update interval. I can't wait indefinitely for a success as my
>> queues will start to back up, and eventually overflow. An occasional
>> error, or even a short string of errors and skipped updates, isn't a
>> problem, but a systematic error isn't acceptable for a couple of
>> reasons. First, if I miss an over-temperature indication I won't change
>> state to increase circulator speed, or enable heat dumping. Second, and
>> this is of more concern, if the bus corruption causes a write to fail I
>> may assume I've entered a state (once again circulator speed or heat
>> dumping) when I have not. Even if this doesn't occur, in a boiler system
>> the heat inputs are high enough that over-temperature and pressure can
>> occur very quickly, the result being a blown safety valve with the
>> accompanying mess.
>> 
>> Since the low temperature operation has never shown an error or even a
>> warning, it's clear the alternative may be to move away from the devices
>> causing problems when heated.
>> 
>> What I'm wondering is whether the DS18B20's have an inherent
>> vulnerability at high temperatures, and whether system implementation has
>> to assume a high failure rate under these conditions. Even using
>> redundancy this would limit the range of applications where you'd want
>> to use these devices.
>> 
>> It would be interesting to know if you're seeing the same type of error
>> returns. It probably just means increasing your level of logging so that
>> each failed access is indicated. It took me a while to find this since I
>> was originally only logging failures after all attempts had failed. As a
>> result I saw system failures only after weeks of operation. Once I
>> started logging the intermediate warnings it became clear what the
>> problem was.
>> 
>> Paul
>> 
>> Mick Sulley wrote:
>>> Hi Paul,
>>> 
>>> I assume all references to temperatures in your mail are degree F, if so
>>> I am surprised that it causes a problem.  I use 27 DS1820's on my system
>>> which includes 7 measuring solar panels, these can and have gone to over
>>> 120 degrees C and I have not experienced the problems that you have.
>>> 
>>> How are you detecting the errors?  I poll as fast as I can, which is
>>> about 15 seconds or so.  I log when I get a good read from each device,
>>> so error detection really just means no good read for > 45 seconds.  I
>>> have a couple of sensors that I suspect are faulty and fail from time to
>>> time but the rest are fine and nothing seems to be temperature related.
>>> 
>>> If you have some other way to log errors I would be happy to try to
>>> incorporate that into my system to gather more info.
>>> 
>>> Cheers
>>> Mick
>>> 
>>>> On 29/04/15 01:04, Paul W Panish wrote:
>>>> I’m wondering if anyone has information on an issue I’ve been having
>>>> with DS18B20 temperature sensors.
>>>> 
>>>> For some time I’ve been developing a wood fired boiler/heating/DHW
>>>> system controller
>>>> (https://sourceforge.net/projects/bctl/?source=directory) using the
>>>> owcapi for all sensing and I/O functionality. My 1-wire network is
>>>> limited in length and low in device weight. I have two DS18B20
>>>> temperature sensors and three Hobbyboards DS2408 based PIO boards on a
>>>> roughly 50 foot linear topology bus using CAT5e cabling and standard
>>>> RJ45 connectors for daisy-chaining bus segments and device attachment.
>>>> The drops to each device are 1 meter or less. I’m providing power and
>>>> ground through the CAT5e cabling.
>>>> 
>>>> My problem is that there seems to be a strong temperature dependency for
>>>> bus read/write errors caused by the DS18B20 sensors. I’ve replaced the
>>>> sensors a few times with devices purchased at different times and from
>>>> different vendors to rule out random bad devices.
>>>> 
>>>> I’m using a polling loop to read the DS18B20’s and PIO inputs at 5
>>>> second intervals with a conversion resolution of 10 bits
>>>> (temperature10). When the system is cold (<140 degrees F) it can go
>>>> forever (months) with no errors indicated in any device access. However,
>>>> when I fire the boiler I start seeing access errors (file not found) as
>>>> the boiler temperature rises above roughly 150 degrees. The error rate
>>>> increases as temperatures rise to a maximum level of about 185 degrees
>>>> at which point they are quite severe.
>>>> 
>>>> The errors are not just on access to the temperature sensors (which are
>>>> hot), but also on access to the DS2408 devices (which remain at room
>>>> temperature), though much less frequently. From this I’m deducing that
>>>> bus timing is changing for the temperature sensors in such a manner that
>>>> they are corrupting access to other devices. I don’t have a scope so I’m
>>>> unable to check for slew rates, noise, or reflection problems; however,
>>>> none of these should be affected by device heating (well, maybe slew rate…)
>>>> 
>>>> I’ve implemented a redundant read mechanism (in addition to any
>>>> redundancy owfs implements), which has made the system usable, but over
>>>> the long term this is a risky solution. I can tolerate the read errors
>>>> assuming I get an occasional success, however if a write to a PIO output
>>>> is dropped the results could be messy.
>>>> 
>>>> One solution would be to switch to thermocouple sensors for the high
>>>> temperature components, using the MAX31850 devices, which I’ll do in the
>>>> absence of any other remedy. However, the temperatures I’m dealing with
>>>> are all well within the specified limits of the DS18B20 family of
>>>> devices, so I’m wondering if anyone has had similar experience and could
>>>> shed some light on the situation.
>>>> 
>>>> 
>>>> ------------------------------------------------------------------------------
>>>> One dashboard for servers and applications across Physical-Virtual-Cloud
>>>> Widest out-of-the-box monitoring support with 50+ applications
>>>> Performance metrics, stats and reports that give you Actionable Insights
>>>> Deep dive visibility with transaction tracing using APM Insight.
>>>> http://ad.doubleclick.net/ddm/clk/290420510;117567292;y
>>>> _______________________________________________
>>>> Owfs-developers mailing list
>>>> Owfs-developers@lists.sourceforge.net
>>>> https://lists.sourceforge.net/lists/listinfo/owfs-developers
>>> 
> 
> 

