It's been running for quite some time now (2.7p35). Seems to be working! Great!

2010/4/10 Paul Alfille <[email protected]>:
> Can you try 2.7p35? I reworked the owserver locking mechanism.
>
> 2010/4/6 Patrik Åkerfeldt <[email protected]>:
>> Any news about this? I have had my entire 1-wire network shut down
>> for over two weeks now and I would love to get things up and running
>> again. If there is any information that would help, I would be
>> happy to provide it.
>>
>> Thanks,
>> Patrik
>>
>> On 16 March 2010 at 21:04, Patrik Åkerfeldt <[email protected]> wrote:
>>> The polling is done from a different machine than the one running owserver,
>>> using owread in several bash scripts.
>>> The scripts are invoked from crontab, some at different intervals. I have
>>> now removed all crontab entries except the one reading
>>> /26.6447E7000000/vis (which seems to be the one causing the problem), but the
>>> deadlock hasn't occurred yet.
>>> It seems that the deadlock only occurs when several polling scripts are
>>> enabled in crontab, one of them reading /vis.
>>>
>>> /på
>>>
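One way to test the "several pollers at once" theory from the client side is to make the cron scripts take turns. The sketch below is only an illustration, not something proposed in the thread: `poll_once`, `OW_SERVER`, `LOCK_FILE`, the host name `owhost` and the 30 s / 10 s limits are all assumed names and values. It relies on `flock` (util-linux) and `timeout` (coreutils) being installed, and on `owread -s` for selecting the owserver address. It does not fix a server-side deadlock; it only keeps this client from issuing overlapping reads.

```shell
#!/bin/sh
# Hypothetical poller wrapper; every name below is illustrative.
OW_SERVER="${OW_SERVER:-owhost:3001}"      # assumed host; port 3001 as in the thread
LOCK_FILE="${LOCK_FILE:-/tmp/owpoll.lock}" # shared by all polling scripts

poll_once() {
    # $1: 1-wire path, e.g. /26.6447E7000000/vis
    # flock -w 30: wait up to 30 s for the shared lock, then give up.
    # timeout 10:  stop owread if owserver does not answer within 10 s,
    #              so hung clients do not pile up.
    flock -w 30 "$LOCK_FILE" timeout 10 owread -s "$OW_SERVER" "$1"
}

# When run as a script with a path argument, read that path once.
if [ "$#" -gt 0 ]; then
    poll_once "$1"
fi
```

If the hang stops once every crontab entry goes through the same lock file, that would point towards concurrent requests as the trigger.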
>>> On 16 March 2010 at 19:04, Patrik Åkerfeldt <[email protected]> wrote:
>>>>
>>>> Paul, please let me know if you need more feedback.
>>>>
>>>> Thanks,
>>>>
>>>> 2010/3/16 Paul Alfille <[email protected]>
>>>>>
>>>>> Perhaps the mutex is already locked and doesn't unlock.
>>>>>
>>>>> I'll investigate.
>>>>>
>>>>> Paul
>>>>>
>>>>> On Tue, Mar 16, 2010 at 3:47 AM, Christian Magnusson <[email protected]> wrote:
>>>>> > I looked briefly at the source, and I can’t find any reason for the
>>>>> > hanging lock in ow_read.c:244.
>>>>> >
>>>>> > It should lock the parsedname, read the value, and unlock as soon as
>>>>> > it’s finished. I can’t see how it could skip the unlock call.
>>>>> >
>>>>> > I’m afraid I have too much going on at work and at home to look into
>>>>> > it further right now.
>>>>> >
>>>>> >
>>>>> >
>>>>> > /Christian
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> > From: Patrik Åkerfeldt [mailto:[email protected]]
>>>>> > Sent: 11 March 2010 18:25
>>>>> > To: OWFS (One-wire file system) discussion and help
>>>>> > Subject: Re: [Owfs-developers] owserver stops responding
>>>>> >
>>>>> >
>>>>> >
>>>>> > Thanks for the reply,
>>>>> >
>>>>> > Here's the output from gdb:
>>>>> >
>>>>> > Thread 3 (Thread 0x473f0940 (LWP 24920)):
>>>>> > #0  0x0000003c0300c21f in sem_timedwait () from /lib64/libpthread.so.0
>>>>> > #1  0x00000000004047ee in Handler (file_descriptor=14) at handler.c:249
>>>>> > #2  0x00002ba23185266b in ServerProcessHandler (arg=0x27dc6c0) at ow_net_server.c:158
>>>>> > #3  0x0000003c030062f7 in start_thread () from /lib64/libpthread.so.0
>>>>> > #4  0x0000003c028d1e3d in clone () from /lib64/libc.so.6
>>>>> >
>>>>> > Thread 2 (Thread 0x47df1940 (LWP 24921)):
>>>>> > #0  0x0000003c0300c888 in __lll_mutex_lock_wait () from /lib64/libpthread.so.0
>>>>> > #1  0x0000003c030088a5 in _L_mutex_lock_107 () from /lib64/libpthread.so.0
>>>>> > #2  0x0000003c03008333 in pthread_mutex_lock () from /lib64/libpthread.so.0
>>>>> > #3  0x00002ba231850466 in LockGet (pn=0x47df1048) at ow_locks.c:189
>>>>> > #4  0x00002ba231858cad in FS_r_given_bus (owq=0x47df1020) at ow_read.c:244
>>>>> > #5  0x00002ba231858fc2 in FS_read_distribute (owq=0x47df1020) at ow_read.c:207
>>>>> > #6  0x00002ba2318597ec in FS_read_postparse (owq=0x47df1020) at ow_read.c:110
>>>>> > #7  0x000000000040332c in ReadHandler (hd=0x473efec0, cm=0x47df10e0, owq=0x47df1020) at read.c:84
>>>>> > #8  0x0000000000403ff7 in DataHandler (v=<value optimized out>) at data.c:133
>>>>> > #9  0x0000003c030062f7 in start_thread () from /lib64/libpthread.so.0
>>>>> > #10 0x0000003c028d1e3d in clone () from /lib64/libc.so.6
>>>>> >
>>>>> > Thread 1 (Thread 0x2ba231d322c0 (LWP 14327)):
>>>>> > #0  0x0000003c0300dba8 in do_sigwait () from /lib64/libpthread.so.0
>>>>> > #1  0x0000003c0300dc4d in sigwait () from /lib64/libpthread.so.0
>>>>> > #2  0x00002ba231852179 in ServerProcess (HandlerRoutine=0x404350 <Handler>, Exit=0x402050 <ow_exit>) at ow_net_server.c:349
>>>>> > #3  0x000000000040223b in main (argc=8, argv=0x7fff792bf808) at owserver.c:162
>>>>> >
>>>>> > I hope this will explain things.
>>>>> > -Patrik
>>>>> >
>>>>> > 2010/3/10 Christian Magnusson <[email protected]>
>>>>> >
>>>>> > It really seems like a deadlock in the code. It also seems a bit
>>>>> > strange that the handler_count inside the braces is not incremented
>>>>> > one by one.
>>>>> >
>>>>> > This means that you have multiple applications polling the device… I
>>>>> > would guess 5-10 shell scripts filling up the request queue/capacity.
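To get a rough picture of how many requests are in flight per path, the handler lines in the `--error_level=9` output can be tallied. This is a small sketch under assumptions: `handler_lines_by_path` is a made-up helper name, and it assumes that in the raw owserver output (before the mail client wrapped it) the path follows `handler {N}` on the same line.

```shell
# Hypothetical helper: count "handler {N} /path" log lines per path.
# Reads the named log files, or standard input if none are given.
handler_lines_by_path() {
    awk '
        # [{] and [}] match literal braces in every awk flavour.
        match($0, /handler [{][0-9]+[}] +/) {
            rest = substr($0, RSTART + RLENGTH)  # text after "handler {N} "
            split(rest, parts, " ")
            count[parts[1]]++                    # first word is the path
        }
        END { for (p in count) print count[p], p }
    ' "$@" | sort -k2
}
```

Usage could look like `handler_lines_by_path owserver.log`, where `owserver.log` is whatever file the foreground output was redirected to.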
>>>>> >
>>>>> >
>>>>> >
>>>>> > # ps -ef | grep owserver
>>>>> >
>>>>> > Find the process id of owserver.
>>>>> >
>>>>> >
>>>>> >
>>>>> > Halt the execution with gdb.
>>>>> >
>>>>> > # gdb /opt/owfs/bin/owserver -p 12345
>>>>> >
>>>>> > (show backtrace for all running threads with gdb)
>>>>> >
>>>>> > (gdb) thread apply all bt
>>>>> >
>>>>> >
>>>>> >
>>>>> > Press return to scroll through all pages and copy the result here. It
>>>>> > will show whether any threads are hanging in a lock call.
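The same backtrace can be captured without paging through an interactive session by running gdb in batch mode. A minimal sketch, assuming `pgrep` is available and that the process is named exactly `owserver`; `dump_owserver_threads` is an illustrative name. Attaching usually requires root or the same user that runs owserver, and the process is paused only while gdb collects the backtraces.

```shell
# Hypothetical helper: print backtraces of all owserver threads.
dump_owserver_threads() {
    # pgrep -x matches the exact process name; -o picks the oldest match.
    ow_pid=$(pgrep -o -x owserver) || {
        echo "owserver is not running" >&2
        return 1
    }
    # -batch exits after the -ex command has run, detaching from the process.
    gdb -p "$ow_pid" -batch -ex "thread apply all bt"
}
```

Redirecting the output, for example `dump_owserver_threads > owserver-bt.txt 2>&1`, gives a file that can be pasted into a reply.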
>>>>> >
>>>>> >
>>>>> >
>>>>> > I will be on vacation until Sunday, but I’m sure Paul can track down
>>>>> > the problem if the gdb output shows an obvious hang.
>>>>> >
>>>>> >
>>>>> >
>>>>> > /Christian
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> > From: Patrik Åkerfeldt [mailto:[email protected]]
>>>>> > Sent: 10 March 2010 18:57
>>>>> > To: [email protected]
>>>>> > Subject: Re: [Owfs-developers] owserver stops responding
>>>>> >
>>>>> >
>>>>> >
>>>>> > I have the same problem with owfs-2.7p31, although the output is
>>>>> > slightly different:
>>>>> >
>>>>> >   DEBUG: handler.c:SingleHandler(317) NOPING handler {165} /26.6447E7000000/temperature
>>>>> >    CALL: handler.c:SingleHandler(273) sem_timedwait timeout time=0.101 (timeout=100 ms)
>>>>> >   DEBUG: handler.c:SingleHandler(317) NOPING handler {170} /26.6447E7000000/vis
>>>>> >    CALL: handler.c:SingleHandler(273) sem_timedwait timeout time=0.101 (timeout=100 ms)
>>>>> >   DEBUG: handler.c:SingleHandler(317) NOPING handler {153} /26.6447E7000000/temperature
>>>>> >    CALL: handler.c:SingleHandler(273) sem_timedwait timeout time=0.101 (timeout=100 ms)
>>>>> >   DEBUG: handler.c:SingleHandler(317) NOPING handler {161} /26.6447E7000000/temperature
>>>>> >    CALL: handler.c:SingleHandler(273) sem_timedwait timeout time=0.101 (timeout=100 ms)
>>>>> >   DEBUG: handler.c:SingleHandler(317) NOPING handler {157} /26.6447E7000000/vis
>>>>> >    CALL: handler.c:SingleHandler(273) sem_timedwait timeout time=0.101 (timeout=100 ms)
>>>>> >   DEBUG: handler.c:SingleHandler(317) NOPING handler {156} /26.6447E7000000/vis
>>>>> >
>>>>> > I would really appreciate some help in this matter.
>>>>> > -Patrik
>>>>> >
>>>>> > On 10 March 2010 at 11:49, Patrik Åkerfeldt
>>>>> > <[email protected]> wrote:
>>>>> >
>>>>> > No, that was the wrong conclusion on my part. I removed the /temperature
>>>>> > polling and kept only the vis reading, but it still "hangs".
>>>>> >
>>>>> > I will try the latest version of owfs instead.
>>>>> >
>>>>> >
>>>>> >
>>>>> > -Patrik
>>>>> >
>>>>> >
>>>>> >
>>>>> > On 10 March 2010 at 11:46, Patrik Åkerfeldt
>>>>> > <[email protected]> wrote:
>>>>> >
>>>>> > Could it be that I poll the same device (26.6447E7000000) for different
>>>>> > readings (temperature and vis) at the same time, and that it somehow
>>>>> > hangs?
>>>>> >
>>>>> > Polling is done from bash scripts invoked from crontab at the same
>>>>> > interval.
>>>>> >
>>>>> >
>>>>> >
>>>>> > -Patrik
>>>>> >
>>>>> >
>>>>> >
>>>>> > On 9 March 2010 at 21:29, Patrik Åkerfeldt
>>>>> > <[email protected]> wrote:
>>>>> >
>>>>> > I've added a new device to my 1-wire network that I poll regularly (a
>>>>> > solar sensor + temp from hobby-boards). Since then (I think!), owserver
>>>>> > seems to stop responding after a while. Sometimes it takes 6-10 h before
>>>>> > it "hangs" and sometimes just a couple of minutes.
>>>>> >
>>>>> > The last time, I started owserver with --error_level=9, and this is the
>>>>> > output while owserver is malfunctioning:
>>>>> >
>>>>> >   DEBUG: to_client.c:ToClient(63) Send delay message
>>>>> >   DEBUG: handler.c:SingleHandler(239) NOPING handler {43} /26.6447E7000000/temperature
>>>>> >   DEBUG: handler.c:SingleHandler(229) PING handler {51} /26.6447E7000000/temperature
>>>>> >   DEBUG: to_client.c:ToClient(56) payload=-1 size=0, ret=0, sg=0x0 offset=0
>>>>> >   DEBUG: to_client.c:ToClient(63) Send delay message
>>>>> >   DEBUG: handler.c:SingleHandler(239) NOPING handler {53} /26.6447E7000000/temperature
>>>>> >   DEBUG: handler.c:SingleHandler(239) NOPING handler {55} /26.6447E7000000/vis
>>>>> >   DEBUG: handler.c:SingleHandler(239) NOPING handler {60} /26.6447E7000000/temperature
>>>>> >   DEBUG: handler.c:SingleHandler(239) NOPING handler {59} /26.6447E7000000/vis
>>>>> >   DEBUG: handler.c:SingleHandler(229) PING handler {57} /26.6447E7000000/temperature
>>>>> >   DEBUG: to_client.c:ToClient(56) payload=-1 size=0, ret=0, sg=0x0 offset=0
>>>>> >   DEBUG: to_client.c:ToClient(63) Send delay message
>>>>> >   DEBUG: handler.c:SingleHandler(239) NOPING handler {47} /26.6447E7000000/temperature
>>>>> >   DEBUG: handler.c:SingleHandler(239) NOPING handler {45} /26.6447E7000000/vis
>>>>> >   DEBUG: handler.c:SingleHandler(239) NOPING handler {49} /26.6447E7000000/vis
>>>>> >   DEBUG: handler.c:SingleHandler(239) NOPING handler {43} /26.6447E7000000/temperature
>>>>> >   DEBUG: handler.c:SingleHandler(239) NOPING handler {51} /26.6447E7000000/temperature
>>>>> >   DEBUG: handler.c:SingleHandler(239) NOPING handler {53} /26.6447E7000000/temperature
>>>>> >   DEBUG: handler.c:SingleHandler(239) NOPING handler {55} /26.6447E7000000/vis
>>>>> >   DEBUG: handler.c:SingleHandler(239) NOPING handler {60} /26.6447E7000000/temperature
>>>>> >   DEBUG: handler.c:SingleHandler(239) NOPING handler {59} /26.6447E7000000/vis
>>>>> >   DEBUG: handler.c:SingleHandler(239) NOPING handler {57} /26.6447E7000000/temperature
>>>>> >   DEBUG: handler.c:SingleHandler(229) PING handler {47} /26.6447E7000000/temperature
>>>>> >   DEBUG: to_client.c:ToClient(56) payload=-1 size=0, ret=0, sg=0x0 offset=0
>>>>> >
>>>>> > I can't tell what's wrong from these messages, but perhaps somebody
>>>>> > else can?
>>>>> > The issue is temporarily resolved by restarting owserver.
>>>>> >
>>>>> > owserver is started this way: /opt/owfs/bin/owserver -u -p 3001
>>>>> > --usb_regulartime --timeout_volatile=0 --foreground --error_level=9
>>>>> > Running owserver from owfs-2.7p26.
>>>>> >
>>>>> > Thanks,
>>>>> > -Patrik
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> > __________ Information from ESET NOD32 Antivirus, virus signature
>>>>> > database version 4933 (20100310) __________
>>>>> >
>>>>> > The message was checked by ESET NOD32 Antivirus.
>>>>> >
>>>>> > http://www.esetscandinavia.com
>>>>> >
>>>>> > ------------------------------------------------------------------------------
>>>>> > Download Intel® Parallel Studio Eval
>>>>> > Try the new software tools for yourself. Speed compiling, find bugs
>>>>> > proactively, and fine-tune applications for parallel performance.
>>>>> > See why Intel Parallel Studio got high marks during beta.
>>>>> > http://p.sf.net/sfu/intel-sw-dev
>>>>> > _______________________________________________
>>>>> > Owfs-developers mailing list
>>>>> > [email protected]
>>>>> > https://lists.sourceforge.net/lists/listinfo/owfs-developers
>>>>> >
>>>>
>>>
>>>
>>
>>
>
>
