It's been running for quite some time now (2.7p35). Seems to be working! Great!
2010/4/10 Paul Alfille <[email protected]>:
> Can you try 2.7p35? I reworked the owserver locking mechanism.
>
> 2010/4/6 Patrik Åkerfeldt <[email protected]>:
>> Any news about this? I have had my complete 1-wire network shut down
>> for over two weeks now and I would love to get things up and running
>> again. If there is information that could be of any help I would be
>> happy to assist.
>>
>> Thanks,
>> Patrik
>>
>> On 16 March 2010 at 21:04, Patrik Åkerfeldt <[email protected]> wrote:
>>> The polling is done from another machine (not the one running owserver)
>>> using owread in different bash scripts.
>>> The scripts are invoked from crontab, some at different intervals. I have
>>> now removed all entries in crontab besides the one reading
>>> /26.6447E7000000/vis (which seems to be the one causing the problem) but
>>> the deadlock hasn't occurred yet.
>>> It seems that the deadlock only occurs when I have several polling scripts
>>> enabled in crontab, one being /vis.
>>>
>>> /på
>>>
>>> On 16 March 2010 at 19:04, Patrik Åkerfeldt <[email protected]> wrote:
>>>>
>>>> Paul, please let me know if you need more feedback.
>>>>
>>>> Thanks,
>>>>
>>>> 2010/3/16 Paul Alfille <[email protected]>
>>>>>
>>>>> Perhaps the mutex is already locked and doesn't unlock.
>>>>>
>>>>> I'll investigate.
>>>>>
>>>>> Paul
>>>>>
>>>>> On Tue, Mar 16, 2010 at 3:47 AM, Christian Magnusson <[email protected]> wrote:
>>>>> > I looked briefly at the source, and I can’t find any reason for the
>>>>> > hanging lock in ow_read.c:244.
>>>>> >
>>>>> > It should lock the parsedname, read the value, and unlock just after
>>>>> > it’s finished. I can’t see how it could skip this unlock function.
>>>>> >
>>>>> > I’m afraid I have too much here at work and home to look more into it
>>>>> > right now.
>>>>> >
>>>>> > /Christian
>>>>> >
>>>>> >
>>>>> > From: Patrik Åkerfeldt [mailto:[email protected]]
>>>>> > Sent: 11 March 2010 18:25
>>>>> > To: OWFS (One-wire file system) discussion and help
>>>>> > Subject: Re: [Owfs-developers] owserver stops responding
>>>>> >
>>>>> > Thanks for the reply,
>>>>> >
>>>>> > Here's the output from gdb:
>>>>> >
>>>>> > Thread 3 (Thread 0x473f0940 (LWP 24920)):
>>>>> > #0 0x0000003c0300c21f in sem_timedwait () from /lib64/libpthread.so.0
>>>>> > #1 0x00000000004047ee in Handler (file_descriptor=14) at handler.c:249
>>>>> > #2 0x00002ba23185266b in ServerProcessHandler (arg=0x27dc6c0) at ow_net_server.c:158
>>>>> > #3 0x0000003c030062f7 in start_thread () from /lib64/libpthread.so.0
>>>>> > #4 0x0000003c028d1e3d in clone () from /lib64/libc.so.6
>>>>> >
>>>>> > Thread 2 (Thread 0x47df1940 (LWP 24921)):
>>>>> > #0 0x0000003c0300c888 in __lll_mutex_lock_wait () from /lib64/libpthread.so.0
>>>>> > #1 0x0000003c030088a5 in _L_mutex_lock_107 () from /lib64/libpthread.so.0
>>>>> > #2 0x0000003c03008333 in pthread_mutex_lock () from /lib64/libpthread.so.0
>>>>> > #3 0x00002ba231850466 in LockGet (pn=0x47df1048) at ow_locks.c:189
>>>>> > #4 0x00002ba231858cad in FS_r_given_bus (owq=0x47df1020) at ow_read.c:244
>>>>> > #5 0x00002ba231858fc2 in FS_read_distribute (owq=0x47df1020) at ow_read.c:207
>>>>> > #6 0x00002ba2318597ec in FS_read_postparse (owq=0x47df1020) at ow_read.c:110
>>>>> > #7 0x000000000040332c in ReadHandler (hd=0x473efec0, cm=0x47df10e0, owq=0x47df1020) at read.c:84
>>>>> > #8 0x0000000000403ff7 in DataHandler (v=<value optimized out>) at data.c:133
>>>>> > #9 0x0000003c030062f7 in start_thread () from /lib64/libpthread.so.0
>>>>> > #10 0x0000003c028d1e3d in clone () from /lib64/libc.so.6
>>>>> >
>>>>> > Thread 1 (Thread 0x2ba231d322c0 (LWP 14327)):
>>>>> > #0 0x0000003c0300dba8 in do_sigwait () from /lib64/libpthread.so.0
>>>>> > #1 0x0000003c0300dc4d in sigwait () from /lib64/libpthread.so.0
>>>>> > #2 0x00002ba231852179 in ServerProcess (HandlerRoutine=0x404350 <Handler>, Exit=0x402050 <ow_exit>) at ow_net_server.c:349
>>>>> > #3 0x000000000040223b in main (argc=8, argv=0x7fff792bf808) at owserver.c:162
>>>>> >
>>>>> > I hope this will explain things.
>>>>> > -Patrik
>>>>> >
>>>>> > 2010/3/10 Christian Magnusson <[email protected]>
>>>>> >
>>>>> > It really seems like a deadlock in the code. It also seems a bit
>>>>> > strange that the handler_count inside the brackets is not incremented
>>>>> > one by one.
>>>>> >
>>>>> > This means that you have multiple applications polling the device… I
>>>>> > would guess 5-10 shell scripts filling up the request queue/capacity.
>>>>> >
>>>>> > Find the process id of owserver:
>>>>> >
>>>>> > # ps -ef | grep owserver
>>>>> >
>>>>> > Halt the execution with gdb:
>>>>> >
>>>>> > # gdb /opt/owfs/bin/owserver -p 12345
>>>>> >
>>>>> > Show the backtrace for all running threads with gdb:
>>>>> >
>>>>> > (gdb) thread apply all bt
>>>>> >
>>>>> > Press return to scroll down all pages and copy the result here. It
>>>>> > will show if any threads are hanging in a lock call.
>>>>> >
>>>>> > I will be on vacation until Sunday, but I’m sure Paul can track down
>>>>> > the problem if the gdb output shows some obvious hanging.
>>>>> >
>>>>> > /Christian
>>>>> >
>>>>> >
>>>>> > From: Patrik Åkerfeldt [mailto:[email protected]]
>>>>> > Sent: 10 March 2010 18:57
>>>>> > To: [email protected]
>>>>> > Subject: Re: [Owfs-developers] owserver stops responding
>>>>> >
>>>>> > I have the same problem with owfs-2.7p31, although the output is
>>>>> > slightly different:
>>>>> >
>>>>> > DEBUG: handler.c:SingleHandler(317) NOPING handler {165} /26.6447E7000000/temperature
>>>>> > CALL: handler.c:SingleHandler(273) sem_timedwait timeout time=0.101 (timeout=100 ms)
>>>>> > DEBUG: handler.c:SingleHandler(317) NOPING handler {170} /26.6447E7000000/vis
>>>>> > CALL: handler.c:SingleHandler(273) sem_timedwait timeout time=0.101 (timeout=100 ms)
>>>>> > DEBUG: handler.c:SingleHandler(317) NOPING handler {153} /26.6447E7000000/temperature
>>>>> > CALL: handler.c:SingleHandler(273) sem_timedwait timeout time=0.101 (timeout=100 ms)
>>>>> > DEBUG: handler.c:SingleHandler(317) NOPING handler {161} /26.6447E7000000/temperature
>>>>> > CALL: handler.c:SingleHandler(273) sem_timedwait timeout time=0.101 (timeout=100 ms)
>>>>> > DEBUG: handler.c:SingleHandler(317) NOPING handler {157} /26.6447E7000000/vis
>>>>> > CALL: handler.c:SingleHandler(273) sem_timedwait timeout time=0.101 (timeout=100 ms)
>>>>> > DEBUG: handler.c:SingleHandler(317) NOPING handler {156} /26.6447E7000000/vis
>>>>> >
>>>>> > I would really appreciate some help in this matter.
>>>>> > -Patrik
>>>>> >
>>>>> > On 10 March 2010 at 11:49, Patrik Åkerfeldt <[email protected]> wrote:
>>>>> >
>>>>> > No, that was the wrong conclusion on my part. I removed the
>>>>> > /temperature polling and let the vis reading remain, but it still
>>>>> > "hangs".
>>>>> >
>>>>> > I will try the latest version of owfs instead.
>>>>> >
>>>>> > -Patrik
>>>>> >
>>>>> > On 10 March 2010 at 11:46, Patrik Åkerfeldt <[email protected]> wrote:
>>>>> >
>>>>> > Could it be that I poll the same device (26.6447E7000000) for
>>>>> > different readings (temperature and vis) at the same time, and that
>>>>> > it somehow hangs?
>>>>> >
>>>>> > Polling is done from bash scripts invoked from crontab at the same
>>>>> > interval.
>>>>> >
>>>>> > -Patrik
>>>>> >
>>>>> > On 9 March 2010 at 21:29, Patrik Åkerfeldt <[email protected]> wrote:
>>>>> >
>>>>> > I've added a new device to my 1-wire network that I regularly poll
>>>>> > (a solar sensor + temp from hobby-boards). Since then (I think!),
>>>>> > owserver seems to stop responding after a while. Sometimes it takes
>>>>> > 6-10h before it "hangs" and sometimes just a couple of minutes.
>>>>> >
>>>>> > The last time, I started owserver with --error_level=9 and this is
>>>>> > the output when owserver is malfunctioning:
>>>>> >
>>>>> > DEBUG: to_client.c:ToClient(63) Send delay message
>>>>> > DEBUG: handler.c:SingleHandler(239) NOPING handler {43} /26.6447E7000000/temperature
>>>>> > DEBUG: handler.c:SingleHandler(229) PING handler {51} /26.6447E7000000/temperature
>>>>> > DEBUG: to_client.c:ToClient(56) payload=-1 size=0, ret=0, sg=0x0 offset=0
>>>>> > DEBUG: to_client.c:ToClient(63) Send delay message
>>>>> > DEBUG: handler.c:SingleHandler(239) NOPING handler {53} /26.6447E7000000/temperature
>>>>> > DEBUG: handler.c:SingleHandler(239) NOPING handler {55} /26.6447E7000000/vis
>>>>> > DEBUG: handler.c:SingleHandler(239) NOPING handler {60} /26.6447E7000000/temperature
>>>>> > DEBUG: handler.c:SingleHandler(239) NOPING handler {59} /26.6447E7000000/vis
>>>>> > DEBUG: handler.c:SingleHandler(229) PING handler {57} /26.6447E7000000/temperature
>>>>> > DEBUG: to_client.c:ToClient(56) payload=-1 size=0, ret=0, sg=0x0 offset=0
>>>>> > DEBUG: to_client.c:ToClient(63) Send delay message
>>>>> > DEBUG: handler.c:SingleHandler(239) NOPING handler {47} /26.6447E7000000/temperature
>>>>> > DEBUG: handler.c:SingleHandler(239) NOPING handler {45} /26.6447E7000000/vis
>>>>> > DEBUG: handler.c:SingleHandler(239) NOPING handler {49} /26.6447E7000000/vis
>>>>> > DEBUG: handler.c:SingleHandler(239) NOPING handler {43} /26.6447E7000000/temperature
>>>>> > DEBUG: handler.c:SingleHandler(239) NOPING handler {51} /26.6447E7000000/temperature
>>>>> > DEBUG: handler.c:SingleHandler(239) NOPING handler {53} /26.6447E7000000/temperature
>>>>> > DEBUG: handler.c:SingleHandler(239) NOPING handler {55} /26.6447E7000000/vis
>>>>> > DEBUG: handler.c:SingleHandler(239) NOPING handler {60} /26.6447E7000000/temperature
>>>>> > DEBUG: handler.c:SingleHandler(239) NOPING handler {59} /26.6447E7000000/vis
>>>>> > DEBUG: handler.c:SingleHandler(239) NOPING handler {57} /26.6447E7000000/temperature
>>>>> > DEBUG: handler.c:SingleHandler(229) PING handler {47} /26.6447E7000000/temperature
>>>>> > DEBUG: to_client.c:ToClient(56) payload=-1 size=0, ret=0, sg=0x0 offset=0
>>>>> >
>>>>> > I can't tell what's wrong from these messages but perhaps somebody
>>>>> > else can?
>>>>> > The issue is temporarily resolved by restarting owserver.
>>>>> >
>>>>> > owserver is started this way:
>>>>> > /opt/owfs/bin/owserver -u -p 3001 --usb_regulartime --timeout_volatile=0 --foreground --error_level=9
>>>>> > Running owserver from owfs-2.7p26.
>>>>> >
>>>>> > Thanks,
>>>>> > -Patrik

_______________________________________________
Owfs-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/owfs-developers
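
For readers following the thread: Christian describes the read path as "lock, read, unlock", and Paul suspects a mutex that is locked and never unlocked. The Python sketch below is only an illustration of that pattern, not OWFS code. The names `device_lock` and `read_with_lock` and the timeout value are assumptions made up for this example. It shows one way to guarantee the unlock on every path, and how a timed acquire can report a stuck lock instead of blocking forever the way `pthread_mutex_lock` does in the backtrace.

```python
import threading

# Hypothetical stand-in for a per-device lock; this is not OWFS code.
device_lock = threading.Lock()


def read_with_lock(read_fn, timeout_s=1.0):
    """Lock, read, and always unlock; return None if the lock looks stuck."""
    if not device_lock.acquire(timeout=timeout_s):
        # Another holder still has the lock after timeout_s seconds.
        return None
    try:
        return read_fn()
    finally:
        # Runs on normal return and on exceptions, so no path skips the unlock.
        device_lock.release()
```

A caller would pass in whatever performs the actual read, for example `read_with_lock(lambda: 21.5)`. A `None` result would then be the cue to log which lock was held rather than to wait indefinitely.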
