Re: [CalendarServer-dev] Exception in recvfd --was: Re: continuing work on FreeBSD ports
Am 20.02.2013 um 07:02 schrieb gl...@twistedmatrix.com: On Feb 19, 2013, at 1:28 PM, Axel Rau axel@chaos1.de wrote: Hi, the break came from finding and reading Unix Network Programming by Stevens/Fenner/Rudoff. (-; Am 14.02.2013 um 11:37 schrieb Fredrik Unger: From the top of my head I find it strange that just 3 bytes was received. If I remember correctly my tests sent more (with sendmsg). I think, the reason is, that *no* ancillary data is being sent: Like... no ancillary data is being *passed to sendmsg*? Yes. My debug print at the beginning of sendmsg always shows 0. Or no data is being received from recvmsg? Does tracing the relevant processes with something like strace or dtruss confirm this? What process is being debugged here; master, worker, or both? Both, because the unpack error pointed at a platform/compatibility issue in sendmsg.c, which is used by all parts. I will try to get a trace with dtruss. The debug messages which you've added seem like they might be missing some data, since they just truncate after 'return from sendmsg with', rather than printing a string (even an empty one) after. Yes, the code was not finished. should display sendmsg_result, which was always 1. See attached diff. Without ancillary data, no fd can be shipped. The only place where ancillary data could be provided seems to be twext.internet.sendfdport._SubprocessSocket.doWrite. I would be happy for any hint where to dig... Well, hrm. This code works fine on other platforms, so, one thing to check would be to see if previous versions of the FreeBSD kernel have the same issue. What exact versions of everything are you testing? 8.2 and 9.1. Both show same bug. Axel PS: sendmsg.c seems to be very fragile as debug code changes behaviour of the system (i.e. it locks up after trying a http connect and does not reach point of original exception) This is too general of a comment to really be any use in debugging - but, this code is extremely low-level and sensitive to any I/O that happens on its socket. Where are your debug messages going that are disrupting things? To a file on disk. I *really* don't know what you mean by does not reach point of original exception, though. Can you elaborate? With my debug code, the master process dies immediately on connect to the web server. This does not happen, with the unmodified sendmsg.c. -glyph Axel sendmsg.diff Description: Binary data --- PGP-Key:29E99DD6 ☀ +49 151 2300 9283 ☀ computing @ chaos claudius ___ calendarserver-dev mailing list calendarserver-dev@lists.macosforge.org https://lists.macosforge.org/mailman/listinfo/calendarserver-dev
Re: [CalendarServer-dev] Exception in recvfd --was: Re: continuing work on FreeBSD ports
On Feb 20, 2013, at 1:51 AM, Axel Rau axel@chaos1.de wrote: With my debug code, the master process dies immediately on connect to the web server. This does not happen, with the unmodified sendmsg.c. Your patch appears to be backwards. Your patch also appears to introduce its own segfault. I can't see exactly where yet, but 'char ddfile[30]' makes me suspicious; variables like this should always be [PATH_MAX]. You also have no error-checking on fopen(). I can't tell what's going on in the trace you sent, because the 'ps' output at the top has PIDs which don't correspond to the PIDs below. It also appears to be a trace only of the master, not of the worker. Also, a Python traceback from the invocation of sendmsg() would be helpful. The 'ancillary' that your debug code is printing is the Python argument value for 'ancillary', which means that Python code is getting invoked with no ancillary list. There are only two invocations of sendmsg in the whole codebase, and the one that sends the FD is hard-coded to pass a list with one element in it along (in sendfd). sendmsg is also called to report status from the worker to the master, and that's expected to come along with no sendmsg arguments. -glyph ___ calendarserver-dev mailing list calendarserver-dev@lists.macosforge.org https://lists.macosforge.org/mailman/listinfo/calendarserver-dev
Re: [CalendarServer-dev] Exception in recvfd --was: Re: continuing work on FreeBSD ports
On Feb 19, 2013, at 1:28 PM, Axel Rau axel@chaos1.de wrote: Hi, the break came from finding and reading Unix Network Programming by Stevens/Fenner/Rudoff. (-; Am 14.02.2013 um 11:37 schrieb Fredrik Unger: From the top of my head I find it strange that just 3 bytes was received. If I remember correctly my tests sent more (with sendmsg). I think, the reason is, that *no* ancillary data is being sent: Like... no ancillary data is being *passed to sendmsg*? Or no data is being received from recvmsg? Does tracing the relevant processes with something like strace or dtruss confirm this? What process is being debugged here; master, worker, or both? The debug messages which you've added seem like they might be missing some data, since they just truncate after 'return from sendmsg with', rather than printing a string (even an empty one) after. Without ancillary data, no fd can be shipped. The only place where ancillary data could be provided seems to be twext.internet.sendfdport._SubprocessSocket.doWrite. I would be happy for any hint where to dig... Well, hrm. This code works fine on other platforms, so, one thing to check would be to see if previous versions of the FreeBSD kernel have the same issue. What exact versions of everything are you testing? Axel PS: sendmsg.c seems to be very fragile as debug code changes behaviour of the system (i.e. it locks up after trying a http connect and does not reach point of original exception) This is too general of a comment to really be any use in debugging - but, this code is extremely low-level and sensitive to any I/O that happens on its socket. Where are your debug messages going that are disrupting things? I *really* don't know what you mean by does not reach point of original exception, though. Can you elaborate? -glyph ___ calendarserver-dev mailing list calendarserver-dev@lists.macosforge.org https://lists.macosforge.org/mailman/listinfo/calendarserver-dev
Re: [CalendarServer-dev] Exception in recvfd --was: Re: continuing work on FreeBSD ports
Hi, # ls -l /tmp/debug.dat.485* -rw-r--r-- 1 root wheel 0 Feb 13 23:07 /tmp/debug.dat.48563 -rw-r--r-- 1 root wheel 0 Feb 13 23:07 /tmp/debug.dat.48577 -rw-r--r-- 1 root wheel 0 Feb 13 23:07 /tmp/debug.dat.48578 -rw-r--r-- 1 root wheel 0 Feb 13 23:07 /tmp/debug.dat.48579 -rw-r--r-- 1 root wheel 0 Feb 13 23:07 /tmp/debug.dat.48580 # --- I have no explanation for that. The buffers are not flushed [1], I might have added a manual flush after writes, or that it writes as the threads are exited.. eg you stop the server. If I remember correctly in my case a crashed calendarserver did not terminate the calendarserver. Could have been that a /etc/init.d/calendarserver stop stopped the crashed server and flushed the files. Not sure anymore. [1] http://www.gnu.org/software/libc/manual/html_node/Flushing-Buffers.html (Gnu libc is probably not in FreeBSD :) ) /Fred ___ calendarserver-dev mailing list calendarserver-dev@lists.macosforge.org https://lists.macosforge.org/mailman/listinfo/calendarserver-dev
Re: [CalendarServer-dev] Exception in recvfd --was: Re: continuing work on FreeBSD ports
On 02/14/2013 09:56 AM, Axel Rau wrote: No, shutting down the server does not flush them. I did not recognized in the 1st place that you have no close. Closing them on function return and opening in append mode, in case file exist, on next function entry might be the way to go. I will try fflush(3)... Hi, Yes it was a quick hack, as my bug also broke the test at the time I did most debugging there. At the end we could condense the problem to a small testcase and at the end the bug was found.. For the previous bug the problem was in sendmsg.c, I can not really tell if this is the case here. I just figured that FreeBSD might do something different. You should maybe also log the sending side, so you know if what is sent with sendmsg is received.. if that is all ok, then the problem might be elsewhere.. It is a tricky thing and there might be better methods, I just went for the dirty hack to get some more information after looking at the code. /Fred ___ calendarserver-dev mailing list calendarserver-dev@lists.macosforge.org https://lists.macosforge.org/mailman/listinfo/calendarserver-dev
Re: [CalendarServer-dev] Exception in recvfd --was: Re: continuing work on FreeBSD ports
Hi, Am 14.02.2013 um 09:56 schrieb Axel Rau: I will try fflush(3)... A fflush(ddfp); before the return produced these results: 1. after startup: --- -rw-r--r-- 1 root wheel 64 Feb 14 10:33 debug.dat.63406 -rw-r--r-- 1 root wheel0 Feb 14 10:33 debug.dat.63431 -rw-r--r-- 1 root wheel0 Feb 14 10:33 debug.dat.63432 -rw-r--r-- 1 root wheel0 Feb 14 10:33 debug.dat.63433 -rw-r--r-- 1 root wheel0 Feb 14 10:33 debug.dat.63434 # cat /tmp/debug.dat.63406 recvmsg: fd [27] [0] result [1] recvmsg: fd [25] [0] result [1] # --- 2. After the error occured: --- -rw-r--r-- 1 root wheel 64 Feb 14 10:33 debug.dat.63406 -rw-r--r-- 1 root wheel 56 Feb 14 10:34 debug.dat.63431 -rw-r--r-- 1 root wheel0 Feb 14 10:33 debug.dat.63432 -rw-r--r-- 1 root wheel0 Feb 14 10:33 debug.dat.63433 -rw-r--r-- 1 root wheel0 Feb 14 10:33 debug.dat.63434 # cat /tmp/debug.dat.63431 recvmsg: fd [24] [0] result [3] recvmsg: value: d 0 0 0 --- Does this enlighten someone? I don't know why there are still empty debug files. Axel --- PGP-Key:29E99DD6 ☀ +49 151 2300 9283 ☀ computing @ chaos claudius ___ calendarserver-dev mailing list calendarserver-dev@lists.macosforge.org https://lists.macosforge.org/mailman/listinfo/calendarserver-dev
Re: [CalendarServer-dev] Exception in recvfd --was: Re: continuing work on FreeBSD ports
Hi, Am 14.02.2013 um 10:15 schrieb Fredrik Unger: You should maybe also log the sending side, so you know if what is sent with sendmsg is received. I will try this next... Axel --- PGP-Key:29E99DD6 ☀ +49 151 2300 9283 ☀ computing @ chaos claudius ___ calendarserver-dev mailing list calendarserver-dev@lists.macosforge.org https://lists.macosforge.org/mailman/listinfo/calendarserver-dev
Re: [CalendarServer-dev] Exception in recvfd --was: Re: continuing work on FreeBSD ports
On 02/14/2013 11:37 AM, Fredrik Unger wrote: 2. After the error occured: # cat /tmp/debug.dat.63431 recvmsg: fd [24] [0] result [3] recvmsg: value: d 0 0 0 --- Does this enlighten someone? You could try something like : Index: sendfd.py === --- sendfd.py (revision 10711) +++ sendfd.py (working copy) @@ -15,7 +15,7 @@ # limitations under the License. ## -from struct import pack, unpack +from struct import pack, unpack,calcsize from socket import SOL_SOCKET from twext.python.sendmsg import sendmsg, recvmsg, SCM_RIGHTS @@ -62,5 +62,7 @@ # cmsg_level and cmsg_type really need to be SOL_SOCKET / SCM_RIGHTS, but # since those are the *only* standard values, there's not much point in # checking. +if len(packedFD)!=calcsize(i): +print Error packedFD [%s] (%d) not %d bytes long % (packedFD,len(packedFD),calcsize(ipackedFD,len(packedFD),calcsize(i))) [unpackedFD] = unpack(i, packedFD) return (unpackedFD, data) Not sure if just printing works, might be a better way to log it. Sorry that I can not provide better help. /Fred ___ calendarserver-dev mailing list calendarserver-dev@lists.macosforge.org https://lists.macosforge.org/mailman/listinfo/calendarserver-dev
Re: [CalendarServer-dev] Exception in recvfd --was: Re: continuing work on FreeBSD ports
Hi Fred, thanks a lot for the debug patch. It works with trial: --- # trial test/test_sendmsg.py twext.python.test.test_sendmsg SendmsgTestCase test_roundtrip ... [OK] test_sendSubProcessFD ... [ERROR] test_wrongTypeAncillary ...[OK] === [ERROR] Traceback (most recent call last): Failure: twext.python.test.test_sendmsg.ExitedWithStderr: Traceback (most recent call last): File /usr/local/lib/python2.7/site-packages/Calendar_and_Contacts_Server-4.2-py2.7-freebsd-9.1-RELEASE-amd64.egg/twext/python/test/pullpipe.py, line 22, in module fd, description = recvfd(int(sys.argv[1])) File /usr/local/lib/python2.7/site-packages/Calendar_and_Contacts_Server-4.2-py2.7-freebsd-9.1-RELEASE-amd64.egg/twext/python/sendfd.py, line 65, in recvfd [unpackedFD] = unpack(i, packedFD) struct.error: unpack requires a string argument of length 4 twext.python.test.test_sendmsg.SendmsgTestCase.test_sendSubProcessFD --- Ran 3 tests in 0.122s FAILED (errors=1, successes=2) # ls -l /tmp/debug.dat* total 2 -rw-r--r-- 1 root wheel 31 Feb 13 23:05 debug.dat.48537 -rw-r--r-- 1 root wheel 55 Feb 13 23:05 debug.dat.48538 # cat /tmp/debug.dat.4853* recvmsg: fd [8] [0] result [d] recvmsg: fd [8] [0] result [5] recvmsg: value: 3 0 0 0 # rm /tmp/debug.dat.4853* --- But produces no output in real life: --- [caldav-0] --- exception caught here --- [caldav-0]File /usr/local/lib/python2.7/site-packages/twisted/internet/kqreactor.py, line 279, in _doWriteOrRead [caldav-0] why = selectable.doRead() [caldav-0]File /usr/local/lib/python2.7/site-packages/Calendar_and_Contacts_Server-4.2-py2.7-freebsd-9.1-RELEASE-amd64.egg/twext/internet/sendfdport.py, line 281, in doRead [caldav-0] fd, description = recvfd(self.fd) [caldav-0]File /usr/local/lib/python2.7/site-packages/Calendar_and_Contacts_Server-4.2-py2.7-freebsd-9.1-RELEASE-amd64.egg/twext/python/sendfd.py, line 65, in recvfd [caldav-0] [unpackedFD] = unpack(i, packedFD) [caldav-0] struct.error: unpack requires a string argument of length 4 ... # ls -l /tmp/debug.dat.485* -rw-r--r-- 1 root wheel 0 Feb 13 23:07 /tmp/debug.dat.48563 -rw-r--r-- 1 root wheel 0 Feb 13 23:07 /tmp/debug.dat.48577 -rw-r--r-- 1 root wheel 0 Feb 13 23:07 /tmp/debug.dat.48578 -rw-r--r-- 1 root wheel 0 Feb 13 23:07 /tmp/debug.dat.48579 -rw-r--r-- 1 root wheel 0 Feb 13 23:07 /tmp/debug.dat.48580 # --- I have no explanation for that. Axel Am 13.02.2013 um 20:20 schrieb Fredrik Unger: Hi, Ok. My problems were not with the kqreactor.py so I have not so much experience of that does but . To concentrate on the sendfd.py problem: [caldav-0] File /usr/local/lib/python2.7/site-packages/Calendar_and_Contacts_Server-4.2-py2.7-freebsd-8.2-RELEASE-p9-amd64.egg/twext/python/sendfd.py, line 65, in recvfd [caldav-0][unpackedFD] = unpack(i, packedFD) [caldav-0]struct.error: unpack requires a string argument of length 4 What I did was to add a debug to file thing in sendmsg.c Attached a small test patch (against current trunk). Patch it, then you need to compile the sendmsg.c module : gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat -Wformat-security -Werror=format-security -D_FORTIFY_SOURCE=2 -fPIC -I/usr/include/python2.7 -c sendmsg.c -o sendmsg.o gcc -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-z,relro -Wl,-z,relro -Wl,-z,relro -g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat -Wformat-security -Werror=format-security -D_FORTIFY_SOURCE=2 sendmsg.o -o sendmsg.so or build whole calendar server. Then run the test trial test/test_sendmsg.py In ls /tmp/debug.dat.PID should be the output.. In my case : recvmsg: fd [8] [0] result [d] Now, you can adapt the things to track your problem and start a real calendar server, at the crash there should be several pid files.. maybe you get some data out of them.. /Fred On 02/13/2013 12:05 AM, Axel Rau wrote: I did not make any progress. I tried - in sendmsg_sendmsg to check for the right type arg (with PyErr_Format, got no error) - to print out the args with PySys_WriteStdout (did not get anything) - I'm not even sure, if this error is related to my problem: test_sendSubProcessFD ... [ERROR] - If I try control_message-cmsg_type = 0;, I get Errno 22 in sendfd, as expected I need more pointers... Axel Am 08.02.2013 um 13:30 schrieb Fredrik Unger: Hi, I found some problem with sendfd.c some time ago. [1] My problem was with 64 bit Debian and
Re: [CalendarServer-dev] Exception in recvfd --was: Re: continuing work on FreeBSD ports
On Feb 8, 2013, at 4:30 AM, Fredrik Unger f...@tree.se wrote: I found some problem with sendfd.c some time ago. [1] For what it's worth, this issue was fixed a while ago. (Sorry for the slow reply; was on vacation when this arrived.) -g ___ calendarserver-dev mailing list calendarserver-dev@lists.macosforge.org https://lists.macosforge.org/mailman/listinfo/calendarserver-dev