Sorry for boring everybody with this again.
I found that the debugging was turned off during compile time in the debian
packages. - turned it on, and wrote a little script that adds a line "CRASHED
date" to the logfile and starts the owserver daemon again.
Here are all CRASHED dates including 25 lines of per-context from the logfile.
It seems to crash more often now.
two are related to a write attempt to a PIO device (a heater): 10:38:39 and
11:55:21
the commonn ground appears to be the ToClient routine: it crashes right after
(or at the end ) of it, no matter if called from data.c or ping.c
I now switched over to a different machine (running 2.6.30, not 2.6.32 and
still the old owfs-2.7p17 like we used to on this machine also).
Could there be a problem with the writev command? I will do a kernel upgrade on
the machine in question when I am physically at the machine (a few months),
maybe I can set up a test computer here in Potsdam to see if it is reproducible
with a simple setup.
Michael
--000: 55 28 ED A0 42 00 00
<U(..B..>
TRAFFIC IN <NETREAD> bus=2 (/dev/ttyr05)
Byte buffer DS9097U, length=1
--000: 00
<.>
TRAFFIC IN <NETREAD> bus=2 (/dev/ttyr05)
Byte buffer DS9097U, length=1
--000: A9
<.>
DEBUG: ow_tcp_read.c:tcp_read(114) read: 9 - 0 = 9
DEBUG: ow_transaction.c:BUS_transaction_single(84) select = 0
TRAFFIC OUT <write> bus=2 (/dev/ttyr05)
Byte buffer DS9097U, length=1
--000: BE
<.>
DEBUG: loop.c:Ping_or_Send(112) Taking too long, send a keep-alive pulse
DEBUG: to_client.c:ToClient(56) payload=-1 size=0, ret=0, sg=0x0 offset=0
DEBUG: to_client.c:ToClient(63) Send delay message
DEBUG: loop.c:Ping_or_Send(112) Taking too long, send a keep-alive pulse
DEBUG: to_client.c:ToClient(56) payload=-1 size=0, ret=0, sg=0x0 offset=0
DEBUG: to_client.c:ToClient(63) Send delay message
DEBUG: loop.c:Ping_or_Send(112) Taking too long, send a keep-alive pulse
DEBUG: to_client.c:ToClient(56) payload=-1 size=0, ret=0, sg=0x0 offset=0
DEBUG: to_client.c:ToClient(63) Send delay message
CRASHED Wed Jun 1 08:56:59 UTC 2011
--
--000: FF FF FF FF FF FF FF FF FF
<.........>
DEBUG: ow_tcp_read.c:tcp_read(64) attempt 9 bytes Time: 5.000000 seconds
TRAFFIC IN <NETREAD> bus=2 (/dev/ttyr05)
Byte buffer DS9097U, length=7
--000: 1E 01 09 FA 7F FF 02
<.......>
TRAFFIC IN <NETREAD> bus=2 (/dev/ttyr05)
Byte buffer DS9097U, length=1
--000: 10
<.>
TRAFFIC IN <NETREAD> bus=2 (/dev/ttyr05)
Byte buffer DS9097U, length=1
--000: B1
<.>
DEBUG: ow_tcp_read.c:tcp_read(114) read: 9 - 0 = 9
DEBUG: loop.c:Ping_or_Send(112) Taking too long, send a keep-alive pulse
DEBUG: to_client.c:ToClient(56) payload=-1 size=0, ret=0, sg=0x0 offset=0
DEBUG: to_client.c:ToClient(63) Send delay message
DEBUG: loop.c:Ping_or_Send(112) Taking too long, send a keep-alive pulse
DEBUG: to_client.c:ToClient(56) payload=-1 size=0, ret=0, sg=0x0 offset=0
DEBUG: to_client.c:ToClient(63) Send delay message
DEBUG: loop.c:Ping_or_Send(112) Taking too long, send a keep-alive pulse
DEBUG: to_client.c:ToClient(56) payload=-1 size=0, ret=0, sg=0x0 offset=0
DEBUG: to_client.c:ToClient(63) Send delay message
CRASHED Wed Jun 1 10:15:02 UTC 2011
--
DEBUG: ow_read.c:FS_read_postparse(98) /28.EDA042000000/temperature return 12
DEBUG: read.c:ReadHandler(88) ReadHandler: FS_read_postparse read on
/28.EDA042000000/temperature return = 12
OWQ OneWireQuery structure of /28.EDA042000000/temperature
OneWireQuery size=12 offset=0, extension=0
Byte buffer OneWireQuery buffer, length=12
--000: 20 20 20 20 20 31 36 2E 35 36 32 35
< 16.5625>
Cleanup = 0006 OneWireQuery I=12 U=12 F=16.5625 Y=12 D=Thu Jan 1
00:00:12 1970
--- OneWireQuery done
DEBUG: read.c:ReadHandler(96) ReadHandler: FS_read_postparse ok size=12
DEBUG: read.c:ReadHandler(107) ReadHandler: To Client cm->payload=12
cm->size=12 cm->offset=0
Byte buffer Data returned to client, length=12
-- NULL buffer
DEBUG: data.c:DataHandler(149) Read message done value=0x8c50178
(nil) free ow_parseobject.c:OWQ_destroy[304] FREE
DEBUG: ow_parsename.c:FS_ParsedName_destroy(59) /28.EDA042000000/temperature
0x8c542c0 free ow_parsename.c:FS_ParsedName_destroy[62] FREE
DEBUG: data.c:DataHandler(197) DataHandler: FS_ParsedName_destroy done
DEBUG: data.c:DataHandler(211) DataHandler: cm.ret=12
DEBUG: to_client.c:ToClient(56) payload=12 size=12, ret=12, sg=0x103 offset=0
TRAFFIC OUT <to server data> file descriptor=12
Byte buffer FD, length=12
--000: 20 20 20 20 20 31 36 2E 35 36 32 35
< 16.5625>
CRASHED Wed Jun 1 10:38:39 UTC 2011
--
--000: ED
<.>
TRAFFIC IN <NETREAD> bus=0 (/dev/ttyr04)
Byte buffer DS9097U, length=1
--000: F7
<.>
DEBUG: ow_tcp_read.c:tcp_read(114) read: 25 - 0 = 25
TRAFFIC IN <NETREAD> bus=0 (/dev/ttyr04)
Byte buffer DS9097U, length=1
--000: ED
<.>
DEBUG: ow_tcp_read.c:tcp_read(114) read: 25 - 0 = 25
DEBUG: ow_transaction.c:BUS_transaction_single(215) verify = 1
DEBUG: ow_transaction.c:BUS_transaction_single(215) verify = 1
DEBUG: ow_presence.c:CheckThisConnection(245) Presence of 28 2E 80 42 00 00
00 97 NOT found on bus /dev/ttyr04
DEBUG: ow_presence.c:CheckThisConnection(245) Presence of 28 2E 80 42 00 00
00 97 NOT found on bus /dev/ttyr03
DEBUG: loop.c:Ping_or_Send(112) Taking too long, send a keep-alive pulse
DEBUG: to_client.c:ToClient(56) payload=-1 size=0, ret=0, sg=0x0 offset=0
DEBUG: to_client.c:ToClient(63) Send delay message
DEBUG: loop.c:Ping_or_Send(112) Taking too long, send a keep-alive pulse
DEBUG: to_client.c:ToClient(56) payload=-1 size=0, ret=0, sg=0x0 offset=0
DEBUG: to_client.c:ToClient(63) Send delay message
DEBUG: loop.c:Ping_or_Send(112) Taking too long, send a keep-alive pulse
DEBUG: to_client.c:ToClient(56) payload=-1 size=0, ret=0, sg=0x0 offset=0
DEBUG: to_client.c:ToClient(63) Send delay message
CRASHED Wed Jun 1 10:58:48 UTC 2011
--
TRAFFIC IN <NETREAD> bus=2 (/dev/ttyr05)
Byte buffer DS9097U, length=1
--000: A9
<.>
DEBUG: ow_tcp_read.c:tcp_read(114) read: 9 - 0 = 9
DEBUG: ow_transaction.c:BUS_transaction_single(84) select = 0
TRAFFIC OUT <write> bus=2 (/dev/ttyr05)
Byte buffer DS9097U, length=1
--000: BE
<.>
DEBUG: ow_tcp_read.c:tcp_read(64) attempt 1 bytes Time: 5.000000 seconds
TRAFFIC IN <NETREAD> bus=2 (/dev/ttyr05)
Byte buffer DS9097U, length=1
--000: BE
<.>
DEBUG: ow_tcp_read.c:tcp_read(114) read: 1 - 0 = 1
DEBUG: loop.c:Ping_or_Send(112) Taking too long, send a keep-alive pulse
DEBUG: to_client.c:ToClient(56) payload=-1 size=0, ret=0, sg=0x0 offset=0
DEBUG: to_client.c:ToClient(63) Send delay message
DEBUG: loop.c:Ping_or_Send(112) Taking too long, send a keep-alive pulse
DEBUG: to_client.c:ToClient(56) payload=-1 size=0, ret=0, sg=0x0 offset=0
DEBUG: to_client.c:ToClient(63) Send delay message
DEBUG: loop.c:Ping_or_Send(112) Taking too long, send a keep-alive pulse
DEBUG: to_client.c:ToClient(56) payload=-1 size=0, ret=0, sg=0x0 offset=0
DEBUG: to_client.c:ToClient(63) Send delay message
CRASHED Wed Jun 1 11:30:19 UTC 2011
--
--016: 2F 74 65 6D 70 65 72 61 74 75 72 65 00
</28.EBA242000000/temperature.>
DEBUG: ow_tcp_read.c:tcp_read(114) read: 29 - 0 = 29
DEBUG: handler.c:SingleHandler(155) START handler /28.EBA242000000/temperature
CALL: data.c:DataHandler(106) DataHandler: parse
path=/28.EBA242000000/temperature
DEBUG: ow_parseobject.c:OWQ_create(164) /28.EBA242000000/temperature
CALL: ow_parsename.c:FS_ParsedName_anywhere(95)
path=[/28.EBA242000000/temperature]
0x9fef2f0 alloc ow_parsename.c:FS_ParsedName_setup[263] MALLOC size=8196
DEBUG: ow_cache.c:Cache_Get_Device(900) Looking for device 28 EB A2 42 00 00
00 98
DEBUG: ow_cache.c:Cache_Get_Common(1056) Search in cache sn 28 EB A2 42 00 00
00 98 pointer=0xb7770d5c index=0 size=4
DEBUG: ow_cache.c:Cache_Get_Common(1072) Value found in cache. Remaining
life: 118 seconds.
DEBUG: ow_presence.c:CheckPresence(75) Found device on bus 0
CALL: data.c:DataHandler(147) Read message
DEBUG: read.c:ReadHandler(55) ReadHandler start
DEBUG: read.c:ReadHandler(61) ReadHandler: From Client sm->payload=29
sm->size=8192 sm->offset=0
0x9feb7c8 alloc ow_parseobject.c:OWQ_allocate_read_buffer[254] MALLOC size=13
DEBUG: read.c:ReadHandler(83) ReadHandler: call FS_read_postparse on
/28.EBA242000000/temperature
DEBUG: ow_read.c:FS_read_postparse(79) /28.EBA242000000/temperature
DEBUG: ow_read.c:FS_read_distribute(199) /28.EBA242000000/temperature
DEBUG: ow_read.c:FS_r_given_bus(233) About to read
</28.EBA242000000/temperature> extension=0 size=12 offset=0
0x9fe8dd8 alloc ow_devicelock.c:DeviceLockGet[92] MALLOC size=36
0x9fe8dd8 free ow_devicelock.c:DeviceLockGet[114] FREE
DEBUG: loop.c:Ping_or_Send(112) Taking too long, send a keep-alive pulse
DEBUG: to_client.c:ToClient(56) payload=-1 size=0, ret=0, sg=0x0 offset=0
DEBUG: to_client.c:ToClient(63) Send delay message
CRASHED Wed Jun 1 11:55:21 UTC 2011
--
DEBUG: ow_read.c:FS_read_postparse(98) /28.EDA042000000/temperature return 12
DEBUG: read.c:ReadHandler(88) ReadHandler: FS_read_postparse read on
/28.EDA042000000/temperature return = 12
OWQ OneWireQuery structure of /28.EDA042000000/temperature
OneWireQuery size=12 offset=0, extension=0
Byte buffer OneWireQuery buffer, length=12
--000: 20 20 20 20 20 20 20 31 36 2E 37 35
< 16.75>
Cleanup = 0006 OneWireQuery I=12 U=12 F=16.75 Y=12 D=Thu Jan 1 00:00:12
1970
--- OneWireQuery done
DEBUG: read.c:ReadHandler(96) ReadHandler: FS_read_postparse ok size=12
DEBUG: read.c:ReadHandler(107) ReadHandler: To Client cm->payload=12
cm->size=12 cm->offset=0
Byte buffer Data returned to client, length=12
-- NULL buffer
DEBUG: data.c:DataHandler(149) Read message done value=0x8eb9ca0
(nil) free ow_parseobject.c:OWQ_destroy[304] FREE
DEBUG: ow_parsename.c:FS_ParsedName_destroy(59) /28.EDA042000000/temperature
0x8ebd1a0 free ow_parsename.c:FS_ParsedName_destroy[62] FREE
DEBUG: data.c:DataHandler(197) DataHandler: FS_ParsedName_destroy done
DEBUG: data.c:DataHandler(211) DataHandler: cm.ret=12
DEBUG: to_client.c:ToClient(56) payload=12 size=12, ret=12, sg=0x103 offset=0
TRAFFIC OUT <to server data> file descriptor=12
Byte buffer FD, length=12
--000: 20 20 20 20 20 20 20 31 36 2E 37 35
< 16.75>
CRASHED Wed Jun 1 12:03:39 UTC 2011
--
DEBUG: ow_cache.c:Cache_Get_Common(1056) Search in cache sn 35 5B D9 1A 00 00
00 85 pointer=0xb77bfd5c index=0 size=4
DEBUG: ow_cache.c:Cache_Get_Common(1072) Value found in cache. Remaining
life: 80 seconds.
DEBUG: ow_presence.c:CheckPresence(75) Found device on bus 2
CALL: data.c:DataHandler(147) Read message
DEBUG: read.c:ReadHandler(55) ReadHandler start
DEBUG: read.c:ReadHandler(61) ReadHandler: From Client sm->payload=25
sm->size=4096 sm->offset=0
0x936b098 alloc ow_parseobject.c:OWQ_allocate_read_buffer[254] MALLOC size=13
DEBUG: read.c:ReadHandler(83) ReadHandler: call FS_read_postparse on
/35.5BD91A000000/vis_avg
DEBUG: ow_read.c:FS_read_postparse(79) /35.5BD91A000000/vis_avg
DEBUG: ow_read.c:FS_read_distribute(199) /35.5BD91A000000/vis_avg
DEBUG: ow_read.c:FS_r_given_bus(233) About to read </35.5BD91A000000/vis_avg>
extension=0 size=12 offset=0
0x936ae88 alloc ow_devicelock.c:DeviceLockGet[92] MALLOC size=36
DEBUG: ow_read.c:adjust_file_size(329) file_length=12 offset=0 size=12
DEBUG: ow_cache.c:Cache_Get(823) 35 5B D9 1A 00 00 00 85 size=8
IsUncachedDir=0
DEBUG: ow_cache.c:Cache_Get_Common(1056) Search in cache sn 35 5B D9 1A 00 00
00 85 pointer=0xb77b5298 index=0 size=8
DEBUG: ow_cache.c:Cache_Get_Common(1088) Value found in cache, but expired by
5 seconds.
DEBUG: loop.c:Ping_or_Send(112) Taking too long, send a keep-alive pulse
DEBUG: to_client.c:ToClient(56) payload=-1 size=0, ret=0, sg=0x0 offset=0
DEBUG: to_client.c:ToClient(63) Send delay message
DEBUG: loop.c:Ping_or_Send(112) Taking too long, send a keep-alive pulse
DEBUG: to_client.c:ToClient(56) payload=-1 size=0, ret=0, sg=0x0 offset=0
DEBUG: to_client.c:ToClient(63) Send delay message
DEBUG: loop.c:Ping_or_Send(112) Taking too long, send a keep-alive pulse
DEBUG: to_client.c:ToClient(56) payload=-1 size=0, ret=0, sg=0x0 offset=0
DEBUG: to_client.c:ToClient(63) Send delay message
CRASHED Wed Jun 1 12:24:55 UTC 2011
> We had a owserver 2.7p17 running on an old debian server, without problems.
> We read from 3 serial devices (via a MOXA terminal server) using python/ownet
> and java/ownet interfaces. there is only one instance from the java part, and
> one from the python part, so there could be at most two requests at the same
> time.
>
> Everything went fine until we replaced the server with a new one running
> debian squeeze. First we tried owfs 2.8p8, but owserver crashed very
> frequently. Then we moved back to 2.7p17 and also tried 2.7p39: Here owserver
> froze (no answers) but the daemon was still there. Now we moved back to 2.6p9
> and monitor the daemon and restart it when necessary. Not a very stable
> configuration.
>
> The logfile just before it crashed one time yesterday looks like this, I
> think it is strange that the TRAFFIC IN occurs at the moddle of the line just
> before the crash.
>
> Any ideas would be greatly appreciated.
>
> TRAFFIC IN <NETREAD> bus=0 (/dev/ttyr04)
> Byte buffer DS9097U, length=2
> --000: 1F B2
> <..>
> TRAFFIC OUT <to server data> file descriptor=9
> Byte buffer FD, length=0TRAFFIC IN <NETREAD> bus=2 (/dev/ttyr05)
> Byte buffer DS9097U, length=9
> --000: 40 01 C2 FF 7F FF 10 10 DD
> <@........>
> TRAFFIC OUT <to server data> file descriptor=12
> Byte buffer FD, length=12
> --000: 20 20 20 20 20 20 20 20 20 20 32 30
> < 20>
>
>
> Michael
------------------------------------------------------------------------------
Simplify data backup and recovery for your virtual environment with vRanger.
Installation's a snap, and flexible recovery options mean your data is safe,
secure and there when you need it. Data protection magic?
Nope - It's vRanger. Get your free trial download today.
http://p.sf.net/sfu/quest-sfdev2dev
_______________________________________________
Owfs-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/owfs-developers