Re: [CalendarServer-dev] Exception in recvfd --was: Re: continuing work on FreeBSD ports

2013-02-20 Thread Axel Rau

Am 20.02.2013 um 07:02 schrieb gl...@twistedmatrix.com:

 
 On Feb 19, 2013, at 1:28 PM, Axel Rau axel@chaos1.de wrote:
 
 Hi,
 
 the break came from finding and reading Unix Network Programming by 
 Stevens/Fenner/Rudoff. (-;
 
 Am 14.02.2013 um 11:37 schrieb Fredrik Unger:
 
 From the top of my head I find it strange that just 3 bytes was received.
 If I remember correctly my tests sent more (with sendmsg).
 I think, the reason is, that *no* ancillary data is being sent:
 
 Like... no ancillary data is being *passed to sendmsg*?
Yes. My debug print at the beginning of sendmsg always shows 0.
  Or no data is being received from recvmsg?  Does tracing the relevant 
 processes with something like strace or dtruss confirm this?  What process is 
 being debugged here; master, worker, or both?
Both, because the unpack error pointed at a platform/compatibility issue in 
sendmsg.c, which is used by all parts.
I will try to get a trace with dtruss.
 
 The debug messages which you've added seem like they might be missing some 
 data, since they just truncate after 'return from sendmsg with', rather than 
 printing a string (even an empty one) after.
Yes, the code was not finished. should display sendmsg_result, which was always 
1. See attached diff.
 
 Without ancillary data, no fd can be shipped.
 The only place where ancillary data could be provided seems to be
 twext.internet.sendfdport._SubprocessSocket.doWrite.
 I would be happy for any hint where to dig...
 
 Well, hrm.  This code works fine on other platforms, so, one thing to check 
 would be to see if previous versions of the FreeBSD kernel have the same 
 issue.  What exact versions of everything are you testing?
8.2 and 9.1. Both show same bug.
 
 Axel
 PS: sendmsg.c seems to be very fragile as debug code changes
 behaviour of the system (i.e. it locks up after trying a http
 connect and does not reach point of original exception)
 
 This is too general of a comment to really be any use in debugging - but, 
 this code is extremely low-level and sensitive to any I/O that happens on its 
 socket.  Where are your debug messages going that are disrupting things?
To a file on disk.
 
 I *really* don't know what you mean by does not reach point of original 
 exception, though.  Can you elaborate?
With my debug code, the master process dies immediately on connect to the web 
server.
This does not happen, with the unmodified sendmsg.c.

 
 -glyph
 
Axel


sendmsg.diff
Description: Binary data

---
PGP-Key:29E99DD6  ☀ +49 151 2300 9283  ☀ computing @ chaos claudius

___
calendarserver-dev mailing list
calendarserver-dev@lists.macosforge.org
https://lists.macosforge.org/mailman/listinfo/calendarserver-dev


Re: [CalendarServer-dev] Exception in recvfd --was: Re: continuing work on FreeBSD ports

2013-02-20 Thread Glyph

On Feb 20, 2013, at 1:51 AM, Axel Rau axel@chaos1.de wrote:

 With my debug code, the master process dies immediately on connect to the web 
 server.
 This does not happen, with the unmodified sendmsg.c.


Your patch appears to be backwards.

Your patch also appears to introduce its own segfault.  I can't see exactly 
where yet, but 'char ddfile[30]' makes me suspicious; variables like this 
should always be [PATH_MAX].  You also have no error-checking on fopen().

I can't tell what's going on in the trace you sent, because the 'ps' output at 
the top has PIDs which don't correspond to the PIDs below.  It also appears to 
be a trace only of the master, not of the worker.

Also, a Python traceback from the invocation of sendmsg() would be helpful.  
The 'ancillary' that your debug code is printing is the Python argument value 
for 'ancillary', which means that Python code is getting invoked with no 
ancillary list.  There are only two invocations of sendmsg in the whole 
codebase, and the one that sends the FD is hard-coded to pass a list with one 
element in it along (in sendfd).  sendmsg is also called to report status from 
the worker to the master, and that's expected to come along with no sendmsg 
arguments.

-glyph
___
calendarserver-dev mailing list
calendarserver-dev@lists.macosforge.org
https://lists.macosforge.org/mailman/listinfo/calendarserver-dev


Re: [CalendarServer-dev] Exception in recvfd --was: Re: continuing work on FreeBSD ports

2013-02-19 Thread Glyph

On Feb 19, 2013, at 1:28 PM, Axel Rau axel@chaos1.de wrote:

 Hi,
 
 the break came from finding and reading Unix Network Programming by 
 Stevens/Fenner/Rudoff. (-;
 
 Am 14.02.2013 um 11:37 schrieb Fredrik Unger:
 
 From the top of my head I find it strange that just 3 bytes was received.
 If I remember correctly my tests sent more (with sendmsg).
 I think, the reason is, that *no* ancillary data is being sent:

Like... no ancillary data is being *passed to sendmsg*?  Or no data is being 
received from recvmsg?  Does tracing the relevant processes with something like 
strace or dtruss confirm this?  What process is being debugged here; master, 
worker, or both?

The debug messages which you've added seem like they might be missing some 
data, since they just truncate after 'return from sendmsg with', rather than 
printing a string (even an empty one) after.

 Without ancillary data, no fd can be shipped.
 The only place where ancillary data could be provided seems to be
 twext.internet.sendfdport._SubprocessSocket.doWrite.
 I would be happy for any hint where to dig...

Well, hrm.  This code works fine on other platforms, so, one thing to check 
would be to see if previous versions of the FreeBSD kernel have the same issue. 
 What exact versions of everything are you testing?

 Axel
 PS: sendmsg.c seems to be very fragile as debug code changes
 behaviour of the system (i.e. it locks up after trying a http
 connect and does not reach point of original exception)

This is too general of a comment to really be any use in debugging - but, this 
code is extremely low-level and sensitive to any I/O that happens on its 
socket.  Where are your debug messages going that are disrupting things?

I *really* don't know what you mean by does not reach point of original 
exception, though.  Can you elaborate?

-glyph

___
calendarserver-dev mailing list
calendarserver-dev@lists.macosforge.org
https://lists.macosforge.org/mailman/listinfo/calendarserver-dev


Re: [CalendarServer-dev] Exception in recvfd --was: Re: continuing work on FreeBSD ports

2013-02-14 Thread Fredrik Unger

Hi,



# ls -l /tmp/debug.dat.485*
-rw-r--r--  1 root  wheel  0 Feb 13 23:07 /tmp/debug.dat.48563
-rw-r--r--  1 root  wheel  0 Feb 13 23:07 /tmp/debug.dat.48577
-rw-r--r--  1 root  wheel  0 Feb 13 23:07 /tmp/debug.dat.48578
-rw-r--r--  1 root  wheel  0 Feb 13 23:07 /tmp/debug.dat.48579
-rw-r--r--  1 root  wheel  0 Feb 13 23:07 /tmp/debug.dat.48580
#
---
I have no explanation for that.


The buffers are not flushed [1], I might have added a manual flush after 
 writes, or that it writes as the threads are exited.. eg you stop the 
server. If I remember correctly in my case a crashed calendarserver 
did not terminate the calendarserver.

Could have been that a /etc/init.d/calendarserver stop
stopped the crashed server and flushed the files.
Not sure anymore.

[1] 
http://www.gnu.org/software/libc/manual/html_node/Flushing-Buffers.html 
  (Gnu libc is probably not in FreeBSD :) )


/Fred

___
calendarserver-dev mailing list
calendarserver-dev@lists.macosforge.org
https://lists.macosforge.org/mailman/listinfo/calendarserver-dev


Re: [CalendarServer-dev] Exception in recvfd --was: Re: continuing work on FreeBSD ports

2013-02-14 Thread Fredrik Unger

On 02/14/2013 09:56 AM, Axel Rau wrote:


No, shutting down the server does not flush them.
I did not recognized in the 1st place that you have no close.
Closing them on function return and opening in append mode, in case file exist, 
on next function entry might be the way to go.
I will try fflush(3)...


Hi,

Yes it was a quick hack, as my bug also broke the test at the time I did 
most debugging there. At the end we could condense the problem

to a small testcase and at the end the bug was found..

For the previous bug the problem was in sendmsg.c, I can not really tell 
if this is the case here. I just figured that FreeBSD might do something 
different.


You should maybe also log the sending side, so you know if what is sent
with sendmsg is received..  if that is all ok, then the problem might be 
elsewhere..

It is a tricky thing and there might be better methods, I just went
for the dirty hack to get some more information after looking at
the code.

/Fred
___
calendarserver-dev mailing list
calendarserver-dev@lists.macosforge.org
https://lists.macosforge.org/mailman/listinfo/calendarserver-dev


Re: [CalendarServer-dev] Exception in recvfd --was: Re: continuing work on FreeBSD ports

2013-02-14 Thread Axel Rau
Hi,

Am 14.02.2013 um 09:56 schrieb Axel Rau:

 I will try fflush(3)...
A fflush(ddfp); before the return produced these results:
1. after startup:
---
-rw-r--r--  1 root   wheel   64 Feb 14 10:33 debug.dat.63406
-rw-r--r--  1 root   wheel0 Feb 14 10:33 debug.dat.63431
-rw-r--r--  1 root   wheel0 Feb 14 10:33 debug.dat.63432
-rw-r--r--  1 root   wheel0 Feb 14 10:33 debug.dat.63433
-rw-r--r--  1 root   wheel0 Feb 14 10:33 debug.dat.63434
# cat /tmp/debug.dat.63406
recvmsg: fd [27] [0] result [1]
recvmsg: fd [25] [0] result [1]
#
---
2. After the error occured:
---
-rw-r--r--  1 root   wheel   64 Feb 14 10:33 debug.dat.63406
-rw-r--r--  1 root   wheel   56 Feb 14 10:34 debug.dat.63431
-rw-r--r--  1 root   wheel0 Feb 14 10:33 debug.dat.63432
-rw-r--r--  1 root   wheel0 Feb 14 10:33 debug.dat.63433
-rw-r--r--  1 root   wheel0 Feb 14 10:33 debug.dat.63434
# cat /tmp/debug.dat.63431
recvmsg: fd [24] [0] result [3]
recvmsg: value:
d 0 0 0
---
Does this enlighten someone?
I don't know why there are still empty debug files.

Axel
---
PGP-Key:29E99DD6  ☀ +49 151 2300 9283  ☀ computing @ chaos claudius

___
calendarserver-dev mailing list
calendarserver-dev@lists.macosforge.org
https://lists.macosforge.org/mailman/listinfo/calendarserver-dev


Re: [CalendarServer-dev] Exception in recvfd --was: Re: continuing work on FreeBSD ports

2013-02-14 Thread Axel Rau
Hi,

Am 14.02.2013 um 10:15 schrieb Fredrik Unger:

 You should maybe also log the sending side, so you know if what is sent
 with sendmsg is received.
I will try this next...

Axel
---
PGP-Key:29E99DD6  ☀ +49 151 2300 9283  ☀ computing @ chaos claudius

___
calendarserver-dev mailing list
calendarserver-dev@lists.macosforge.org
https://lists.macosforge.org/mailman/listinfo/calendarserver-dev


Re: [CalendarServer-dev] Exception in recvfd --was: Re: continuing work on FreeBSD ports

2013-02-14 Thread Fredrik Unger



On 02/14/2013 11:37 AM, Fredrik Unger wrote:

2. After the error occured:
# cat /tmp/debug.dat.63431
recvmsg: fd [24] [0] result [3]
recvmsg: value:
d 0 0 0
---
Does this enlighten someone?


You could try something like :

Index: sendfd.py
===
--- sendfd.py   (revision 10711)
+++ sendfd.py   (working copy)
@@ -15,7 +15,7 @@
 # limitations under the License.
 ##

-from struct import pack, unpack
+from struct import pack, unpack,calcsize
 from socket import SOL_SOCKET

 from twext.python.sendmsg import sendmsg, recvmsg, SCM_RIGHTS
@@ -62,5 +62,7 @@
 # cmsg_level and cmsg_type really need to be SOL_SOCKET / 
SCM_RIGHTS, but
 # since those are the *only* standard values, there's not much 
point in

 # checking.
+if len(packedFD)!=calcsize(i):
+print Error packedFD [%s] (%d) not %d bytes long % 
(packedFD,len(packedFD),calcsize(ipackedFD,len(packedFD),calcsize(i)))

 [unpackedFD] = unpack(i, packedFD)
 return (unpackedFD, data)


Not sure if just printing works, might be a better way to log it.
Sorry that I can not provide better help.

/Fred
___
calendarserver-dev mailing list
calendarserver-dev@lists.macosforge.org
https://lists.macosforge.org/mailman/listinfo/calendarserver-dev


Re: [CalendarServer-dev] Exception in recvfd --was: Re: continuing work on FreeBSD ports

2013-02-13 Thread Axel Rau
Hi Fred,

thanks a lot for the debug patch.
It works with trial:
---
# trial test/test_sendmsg.py
twext.python.test.test_sendmsg
  SendmsgTestCase
test_roundtrip ... [OK]
test_sendSubProcessFD ...   [ERROR]
test_wrongTypeAncillary ...[OK]

===
[ERROR]
Traceback (most recent call last):
Failure: twext.python.test.test_sendmsg.ExitedWithStderr: 
Traceback (most recent call last):
  File 
/usr/local/lib/python2.7/site-packages/Calendar_and_Contacts_Server-4.2-py2.7-freebsd-9.1-RELEASE-amd64.egg/twext/python/test/pullpipe.py,
 line 22, in module
fd, description = recvfd(int(sys.argv[1]))
  File 
/usr/local/lib/python2.7/site-packages/Calendar_and_Contacts_Server-4.2-py2.7-freebsd-9.1-RELEASE-amd64.egg/twext/python/sendfd.py,
 line 65, in recvfd
[unpackedFD] = unpack(i, packedFD)
struct.error: unpack requires a string argument of length 4

twext.python.test.test_sendmsg.SendmsgTestCase.test_sendSubProcessFD
---
Ran 3 tests in 0.122s

FAILED (errors=1, successes=2)
# ls -l /tmp/debug.dat*
total 2
-rw-r--r--  1 root   wheel   31 Feb 13 23:05 debug.dat.48537
-rw-r--r--  1 root   wheel   55 Feb 13 23:05 debug.dat.48538
# cat /tmp/debug.dat.4853*
recvmsg: fd [8] [0] result [d]
recvmsg: fd [8] [0] result [5]
recvmsg: value:
3 0 0 0
# rm /tmp/debug.dat.4853*
---
But produces no output in real life:
---
[caldav-0]  --- exception caught here ---
[caldav-0]File 
/usr/local/lib/python2.7/site-packages/twisted/internet/kqreactor.py, line 
279, in _doWriteOrRead
[caldav-0]  why = selectable.doRead()
[caldav-0]File 
/usr/local/lib/python2.7/site-packages/Calendar_and_Contacts_Server-4.2-py2.7-freebsd-9.1-RELEASE-amd64.egg/twext/internet/sendfdport.py,
 line 281, in doRead
[caldav-0]  fd, description = recvfd(self.fd)
[caldav-0]File 
/usr/local/lib/python2.7/site-packages/Calendar_and_Contacts_Server-4.2-py2.7-freebsd-9.1-RELEASE-amd64.egg/twext/python/sendfd.py,
 line 65, in recvfd
[caldav-0]  [unpackedFD] = unpack(i, packedFD)
[caldav-0]  struct.error: unpack requires a string argument of length 4
...
# ls -l /tmp/debug.dat.485*
-rw-r--r--  1 root  wheel  0 Feb 13 23:07 /tmp/debug.dat.48563
-rw-r--r--  1 root  wheel  0 Feb 13 23:07 /tmp/debug.dat.48577
-rw-r--r--  1 root  wheel  0 Feb 13 23:07 /tmp/debug.dat.48578
-rw-r--r--  1 root  wheel  0 Feb 13 23:07 /tmp/debug.dat.48579
-rw-r--r--  1 root  wheel  0 Feb 13 23:07 /tmp/debug.dat.48580
#
---
I have no explanation for that.
Axel

Am 13.02.2013 um 20:20 schrieb Fredrik Unger:

 Hi,
 
 Ok.
 
 My problems were not with the kqreactor.py so I have not so much experience 
 of that does but .
 
 To concentrate on the sendfd.py problem:
 
 [caldav-0]  File 
 /usr/local/lib/python2.7/site-packages/Calendar_and_Contacts_Server-4.2-py2.7-freebsd-8.2-RELEASE-p9-amd64.egg/twext/python/sendfd.py,
  line 65, in recvfd
 [caldav-0][unpackedFD] = unpack(i, packedFD)
 [caldav-0]struct.error: unpack requires a string argument of
 length 4
 
 What I did was to add a debug to file thing in sendmsg.c
 Attached a small test patch (against current trunk).
 Patch it, then you need to compile the sendmsg.c module :
 
 gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall 
 -Wstrict-prototypes -g -O2 -fstack-protector --param=ssp-buffer-size=4 
 -Wformat -Wformat-security -Werror=format-security -D_FORTIFY_SOURCE=2 -fPIC 
 -I/usr/include/python2.7 -c sendmsg.c -o sendmsg.o
 
 
 gcc -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-z,relro 
 -Wl,-z,relro -Wl,-z,relro -g -O2 -fstack-protector --param=ssp-buffer-size=4 
 -Wformat -Wformat-security -Werror=format-security -D_FORTIFY_SOURCE=2 
 sendmsg.o -o sendmsg.so
 
 or build whole calendar server.
 
 Then run the test
 
 trial test/test_sendmsg.py
 
 In ls /tmp/debug.dat.PID should be the output..
 In my case :
 recvmsg: fd [8] [0] result [d]
 
 Now, you can adapt the things to track your problem and start
 a real calendar server, at the crash there should be several pid files..
 maybe you get some data out of them..
 
 /Fred
 
 
 
 On 02/13/2013 12:05 AM, Axel Rau wrote:
 I did not make any progress. I tried
 - in sendmsg_sendmsg to check for the right type arg (with PyErr_Format, got 
 no error)
 - to print out the args with PySys_WriteStdout (did not get anything)
 - I'm not even sure, if this error is related to my problem:
 test_sendSubProcessFD ... [ERROR]
 - If I try control_message-cmsg_type = 0;, I get Errno 22 in sendfd, as 
 expected
 
 I need more pointers...
 
 Axel
 
 Am 08.02.2013 um 13:30 schrieb Fredrik Unger:
 
 Hi,
 
 I found some problem with sendfd.c some time ago. [1]
 
 My problem was with 64 bit Debian and 

Re: [CalendarServer-dev] Exception in recvfd --was: Re: continuing work on FreeBSD ports

2013-02-13 Thread Glyph

On Feb 8, 2013, at 4:30 AM, Fredrik Unger f...@tree.se wrote:

 I found some problem with sendfd.c some time ago. [1]

For what it's worth, this issue was fixed a while ago.  (Sorry for the slow 
reply; was on vacation when this arrived.)

-g
___
calendarserver-dev mailing list
calendarserver-dev@lists.macosforge.org
https://lists.macosforge.org/mailman/listinfo/calendarserver-dev