[modwsgi] Re: mod_wsgi on Python 3.0 (was Re: Python 2.6 and migration warnings flag for Python 3.0.)
On Sep 29, 3:24 pm, Brian Smith [EMAIL PROTECTED] wrote: Toshio Kuratomi wrote: Graham Dumpleton wrote:

As to the HTTP request headers, the RFCs say they are effectively latin-1. Thus, all HTTP_? variables in WSGI environ can only be processed as latin-1 when converting to Unicode.

Converting these headers to unicode will lead to mangled data at times. Let's say that some web app needs to keep track of the referer information for some reason. If the app is referred to from http://localhost/€.html (Euro symbol.html) and it is encoded as utf-8 on the server then the server will send a header with this sequence of bytes::

  Referer http://localhost/%e2%82%ac.html

If mod_wsgi assumes latin-1 and converts that into unicode before it hits the app, the app will see this::

  Referer http://localhost/â%82¬.html

No, it will leave it as http://localhost/%e2%82%ac.html. It does (or should do) the Latin-1-to-Unicode conversion before it decodes URL encoding.

Uhm... you're wrong here. URL encoding and decoding operate on bytes. Unicode is not bytes, so you can't go from byte string to unicode and then pass it through url decode. Or I suppose you can, but it isn't by any means the opposite of what you did to get the url escaped bytes, so it's pretty senseless.

Unlike wsgi.input where the application *must* decide how to decode the data, you are trying to do automatic encoding of data in the wsgi server here. This will cause tracebacks on some unicode string input but not others (which is one of the reasons that people hate unicode handling in python-2). The tracebacks occur because latin-1 characters are a subset of Unicode characters (note that we're not dealing with code-point to byte mapping here, we're dealing with character mapping). So you can always convert latin-1 to unicode. But you can't always convert Unicode to latin-1 (which is what this automatic conversion would attempt). It's much better for the application layer to always hand mod_wsgi byte types, never unicode.

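The disagreement above is about the order of two separate decoding steps. As a minimal sketch (Python 3; the URL is the one from the example, the variable names are only illustrative), the following shows that decoding the raw header as latin-1 leaves the %-escapes untouched, and that the result then depends on which charset is assumed when the escapes are finally expanded:

```python
from urllib.parse import unquote, unquote_to_bytes

# Bytes as they arrive on the wire: pure ASCII, the euro sign is %-escaped.
raw_referer = b"http://localhost/%e2%82%ac.html"

# Step 1: a server that decodes headers as latin-1 changes nothing here,
# because every byte in the header is plain ASCII.
as_text = raw_referer.decode("latin-1")

# Step 2a: expanding the escapes as UTF-8 recovers the euro sign.
as_utf8 = unquote(as_text, encoding="utf-8")

# Step 2b: expanding the escapes to bytes and then assuming latin-1
# produces the mangled form described in the thread.
as_latin1 = unquote_to_bytes(as_text).decode("latin-1")

print(repr(as_utf8))
print(repr(as_latin1))
```

Both outcomes are reproducible; which one an application sees depends only on where the %-decoding happens and which charset is assumed at that point.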
The HTTP standards mandate Latin-1. Python 3.0 says all strings are Unicode. The encoding/decoding is needed to bridge the gap. Treating the HTTP headers as raw sequences of bytes and requiring Python applications to do their own manual decoding/encoding would not be Pythonic and the Python community wouldn't accept it.

I disagree. You are dealing with byte sequences here so you need to call them bytes. This *is* pythonic (as much as you can define that for a type that hasn't existed before :-). Look at the WSGI specification for python-2. It specifies storing the values in str type and not in unicode type and that's accepted by the Python community as Pythonic.

This takes care of the problem but is somewhat silly. We're basically using latin-1 as a marshalling format for passing bytes over the wire. So we have to convert the unicode to bytes as the first step in changing unicode characters outside the latin-1 range into bytes that can go over the wire. At that point converting the bytes back to unicode pretending they're latin-1 instead of utf-8 is just an extra step for no reason.

Again, I think you are misunderstanding the interaction between URL encoding and character encoding conversion. Mod_wsgi will (should) never do or undo URL-encoding itself for non-ASCII (%80-%FF) sequences.

I think that you are misunderstanding the interaction. And I think that % sequences should definitely be done by mod_wsgi. Ending up with a unicode string containing %-encoded sequences is even worse than the other scenarios I described, as the application then has to convert from unicode to byte string, unquote the url quoting, and then convert back to unicode. (Although this is alleviated in python3 by the fact that urllib.parse.quote()/unquote() take an encoding argument, so the extra steps are taken care of by the function.) It would be much better for mod_wsgi to do the url quoting for the user, as converting between bytes and %-escape sequences is 100% automatable.
This is unlike converting between unicode and a sequence of bytes, where something has to decide what the character encoding is. So -- WSGI should take care of %-encoding because that's a job for a computer anyway. WSGI should not take care of the byte <=> unicode conversion because it doesn't know what encoding the bytes are in.

I have two files there. Both are named ½ñ.html (one-half tilde-lowercase-n .html). However one of the filenames is encoded with latin-1 and the other with utf-8. If you switch between character encodings for the web page (firefox3: View::Character Encoding::UTF-8 vs View::Character Encoding::Western (iso 8859-1)) you'll see that you can make one or the other show its name correctly. Why isn't apache able to display both correctly at the same time? It's because apache doesn't know what the encoding of the filenames is. The filesystem is
[modwsgi] Re: mod_wsgi on Python 3.0 (was Re: Python 2.6 and migration warnings flag for Python 3.0.)
On Sep 29, 4:33 pm, Graham Dumpleton [EMAIL PROTECTED] wrote: 2008/9/30 Toshio Kuratomi [EMAIL PROTECTED]:

For response headers and content, the application can either generate bytes and thus control the encoding, or it will fall back to trying to convert it as latin-1 if Unicode is supplied, so like wsgi.input, no problem there.

Unlike wsgi.input where the application *must* decide how to decode the data, you are trying to do automatic encoding of data in the wsgi server here. This will cause tracebacks on some unicode string input but not others (which is one of the reasons that people hate unicode handling in python-2). The tracebacks occur because latin-1 characters are a subset of Unicode characters (note that we're not dealing with code-point to byte mapping here, we're dealing with character mapping). So you can always convert latin-1 to unicode. But you can't always convert Unicode to latin-1 (which is what this automatic conversion would attempt). It's much better for the application layer to always hand mod_wsgi byte types, never unicode.

The amendment page says:

  When running under Python 3, applications SHOULD produce bytes output and headers

So, the ideal situation is that the application would always produce bytes and so it is the application which is supposed to deal with it. That mod_wsgi falls back to converting any Unicode strings to bytes is a fail-safe as dictated by:

  When running under Python 3, servers and gateways MUST accept strings as application output or headers, under the existing rules (i.e., s.encode('latin-1') must convert the string to bytes without an exception)

and is more to protect lazy programmers, plus make it easier to port WSGI applications for Python 2.X.

So there's two things here:

1) Maybe I'm misunderstanding some code but I thought mod_wsgi was decoding bytes going out to the app. If that's not the case and mod_wsgi is only handing byte strings to the apps then that's fine.
(I note that this interaction isn't specified in the Amendment, which goes along with your general feeling on the problems with the WSGI-spec writing process.)

2) pje said that accepting unicode str here would make it easier to port WSGI applications but that's actually not true. In python-2.x, you are only supposed to pass byte strings (py-2.x str) so there's no problem. When those strs are converted to unicode str in py3.x, you have to rewrite your code so you aren't passing non-latin-1 characters. At that point, there's zero incentive to pass a sanitized unicode string to the wsgi server as you had to go through the byte type in order to get there (unless you misunderstand the WSGI spec and think it wants you to send py-3.x str type.)

As for protecting lazy programmers... I'd argue that it's much better to throw an exception immediately upon receiving a unicode type rather than waiting until your app starts getting popular and you suddenly have transient errors due to people occasionally submitting data with non-latin-1 characters.

In other words, your application is the one who should be dealing with it in the first place if you want to be sure about what is being produced.

+100

It only becomes an issue where the WSGI application hasn't done what it really should have done.

As long as mod_wsgi is only converting unicode to bytes and not converting bytes to unicode, this is true.

-Toshio
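The fallback rule quoted from the amendment can be sketched as a small helper. This is only an illustration of the rule as worded above, not mod_wsgi's actual code, and the name `to_wire` is hypothetical:

```python
def to_wire(value):
    """Return bytes for a response header or body chunk.

    bytes pass through untouched; str is accepted only if it
    survives s.encode('latin-1'), as the quoted amendment requires.
    """
    if isinstance(value, bytes):
        return value
    return value.encode("latin-1")  # raises UnicodeEncodeError outside latin-1


print(to_wire(b"Hello World!"))
print(to_wire("\u00bd\u00f1.html"))  # one-half and n-tilde are both in latin-1

try:
    to_wire("\u20ac.html")           # the euro sign is not in latin-1
    failed = False
except UnicodeEncodeError:
    failed = True
print(failed)
```

This is the transient failure mode described above: the same code path works until someone submits a character outside latin-1.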
[modwsgi] Re: mod_wsgi on Python 3.0 (was Re: Python 2.6 and migration warnings flag for Python 3.0.)
On Sep 29, 4:38 pm, Graham Dumpleton [EMAIL PROTECTED] wrote:

As to this whole discussion, as much as it is interesting there is nothing I can do about it. It really needs to be brought up on the Python WEB-SIG where I originally raised the issue of Python 3.0 support for WSGI. I can only implement what consensus comes out of discussion on Python WEB-SIG in lieu of them not wanting to come out with an official revised specification for WSGI.

So I have a couple of questions:

Do you agree or disagree with my analysis that byte type is the ideal going in and out of WSGI?

Do you agree that pje's argument as to why unicode strings should be accepted is specious?

If you agree on those, I'll start a new argument on python-web-sig and see if I can get this changed. There's a high probability that it'll just end with pje and me disagreeing with each other but I'll try my hand as long as someone else who's been implementing WSGI servers thinks that it's the correct approach.

Thanks!
-Toshio
[modwsgi] Re: Running roundup directly under mod_wsgi possible?
Graham Dumpleton wrote:

Just make sure mod_wsgi is working first by following instructions in:

  http://code.google.com/p/modwsgi/wiki/QuickConfigurationGuide

Then just substitute out the hello world script with that snippet.

Yeah, I'm already running about a dozen vhosts under apache, mod_wsgi, and django... some with pretty complex conf files. Anyway, the tracker name needing to be a full path was the problem, so it's working now. Thanks so much for the tip!
[modwsgi] Re: mod_wsgi on Python 3.0 (was Re: Python 2.6 and migration warnings flag for Python 3.0.)
2008/9/30 Toshio Kuratomi [EMAIL PROTECTED]:

So there's two things here:

1) Maybe I'm misunderstanding some code but I thought mod_wsgi was decoding bytes going out to the app. If that's not the case and mod_wsgi is only handing byte strings to the apps then that's fine.
(I note that this interaction isn't specified in the Amendment, which goes along with your general feeling on the problems with the WSGI-spec writing process.)

I thought I had made it clear enough and that the proposed amendments were also clear on this. The wsgi.input stream which contains the request content is 'bytes'. Thus it is not touched by mod_wsgi. The amendments say:

  When running under Python 3, servers MUST make wsgi.input a binary (byte) stream

The amendments do also say, though:

  When running under Python 3, servers MUST provide CGI HTTP variables as strings, decoded from the headers using HTTP standard encodings (i.e. latin-1 + RFC 2047) (Open question: are there any CGI or WSGI variables that should NOT be strings?)

Thus, mod_wsgi does convert the CGI variables (i.e., translated HTTP headers) in the WSGI environment dictionary into Unicode strings using latin-1 encoding.

As I pointed out, there were only a few variables in there which were of concern. Brian has pointed out that the request URI has to be ascii characters, but there possibly still is an open question there on how encoding of non-ascii characters works in practice. We just need to do some actual tests to see what happens and whether there is a problem.

Thus we are possibly down to SCRIPT_FILENAME, given that it reflects a file system path. Again, we just need to do some actual tests to see what happens, remembering that Apache is going to dictate in the main how things work.

2) pje said that accepting unicode str here would make it easier to port WSGI applications but that's actually not true. In python-2.x, you are only supposed to pass byte strings (py-2.x str) so there's no problem. When those strs are converted to unicode str in py3.x, you have to rewrite your code so you aren't passing non-latin-1 characters.
At that point, there's zero incentive to pass a sanitized unicode string to the wsgi server as you had to go through the byte type in order to get there (unless you misunderstand the WSGI spec and think it wants you to send py-3.x str type.)

As for protecting lazy programmers... I'd argue that it's much better to throw an exception immediately upon receiving a unicode type rather than waiting until your app starts getting popular and you suddenly have transient errors due to people occasionally submitting data with non-latin-1 characters.

My feeling was that the fallback to converting to bytes using latin-1 was so that simple applications would still work. For example, the hello world application:

def application(environ, start_response):
    status = '200 OK'
    output = 'Hello World!'

    response_headers = [('Content-type', 'text/plain'),
                        ('Content-Length', str(len(output)))]
    start_response(status, response_headers)

    return [output]

works in both Python 2.X and 3.0 without change.

Larger applications such as Django
[modwsgi] Re: mod_wsgi on Python 3.0 (was Re: Python 2.6 and migration warnings flag for Python 3.0.)
Can we stop with the "mod_wsgi should do this" or "mod_wsgi should do that". The Apache/mod_wsgi module is just one implementation of the WSGI specification. When talking about this you need to look at the bigger picture and what other implementations exist, plus how they all work and interact with the web server they use.

Take CGI for example. If you are using a CGI-WSGI adapter, the WSGI environment will come in through os.environ. If you run Python 3.0 and look at os.environ you will get:

Python 3.0rc1 (r30rc1:66499, Sep 18 2008, 21:39:06)
[GCC 4.0.1 (Apple Computer, Inc. build 5341)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.environ['PATH']
'/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/ose/bin:/usr/local/bin:/Users/grahamd/bin'
>>> type(os.environ['PATH'])
<class 'str'>

So, os.environ already holds values as Unicode string objects and not bytes. Thus there is no chance of them being passed to the application as bytes.

How they get to become Unicode strings depends on the platform. For Windows it uses:

  PyUnicode_FromWideChar()

So, input is Unicode to begin with. On UNIX boxes it uses:

  PyUnicode_FromString()

which presumably means it uses the default system encoding, whatever that might be.

Anyway, already you are stopped from communicating bytes to the WSGI application. One could say that the proposed amendments to the specification for Python 3.0 don't even consider this case where the conversion is already done for you.

Anyway, I have to leave off for now as I have to go home. As I sort of suggest above, keep in mind that the proposed amendments are trying to find a compromise that works for many hosting environments. Thus although you ideally may want bytes everywhere, that may not work in practice.

Graham
[modwsgi] Re: Segmentation fault - premature end of script headers
Today I was not able to start my application as I got segmentation faults constantly. I've attached gdb and this is the result:

(gdb) cont
Continuing.

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread -1212216416 (LWP 29850)]
PyErr_Occurred () at Python/errors.c:80
80      Python/errors.c: No such file or directory.
        in Python/errors.c
(gdb) bt
#0  PyErr_Occurred () at Python/errors.c:80
#1  0x002ce167 in _PyObject_GC_Malloc (basicsize=40) at Modules/gcmodule.c:1326
#2  0x002ce21c in _PyObject_GC_NewVar (tp=0x3083c0, nitems=7) at Modules/gcmodule.c:1352
#3  0x00267c33 in PyTuple_New (size=7) at Objects/tupleobject.c:68
#4  0x0041cdc0 in ?? ()
#5  0x0007 in ?? ()
#6  0x001c in ?? ()
#7  0xb7beab18 in ?? ()
#8  0x0041cd4e in ?? ()
#9  0xb7be9af8 in ?? ()
#10 0xb758b22c in ?? ()
#11 0xb7be9b80 in ?? ()
#12 0x0042fed4 in ?? ()
#13 0x0042fed4 in ?? ()
#14 0xb7be9acc in ?? ()
#15 0xb7be9a2c in ?? ()
#16 0xfbad8001 in ?? ()
#17 0xb7be9ca0 in ?? ()
#18 0xb7be9ca0 in ?? ()
#19 0xb7be9ca0 in ?? ()
#20 0xb7be9ca0 in ?? ()
#21 0x0042fed4 in ?? ()
#22 0x08098608 in apr_bucket_type_eos ()
#23 0x09ba7920 in ?? ()
#24 0x002c in ?? ()
#25 0x in ?? ()
#26 0x0413 in ?? ()
#27 0x in ?? ()
(gdb) thread apply all bt

Thread 4 (Thread -1211159648 (LWP 29848)):
#0  0x00ad57a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1  0x00bb33b1 in ___newselect_nocancel () from /lib/tls/libc.so.6
#2  0x0097826b in apr_sleep (t=29953) at time/unix/time.c:246
#3  0x00a76110 in wsgi_monitor_thread (thd=0x9a93420, data=0x9a92dd0) at mod_wsgi.c:8367
#4  0x0097783c in dummy_worker (opaque=0xfdfe) at threadproc/unix/thread.c:142
#5  0x00c723cc in start_thread () from /lib/tls/libpthread.so.0
#6  0x00bba96e in clone () from /lib/tls/libc.so.6

Thread 3 (Thread -1211688032 (LWP 29849)):
#0  0x00ad57a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1  0x00bb33b1 in ___newselect_nocancel () from /lib/tls/libc.so.6
#2  0x0097826b in apr_sleep (t=100) at time/unix/time.c:246
#3  0x00a75f6a in wsgi_deadlock_thread (thd=0x9a93440, data=0x9a92dd0) at mod_wsgi.c:8279
#4  0x0097783c in dummy_worker (opaque=0xfdfe) at threadproc/unix/thread.c:142
#5  0x00c723cc in start_thread () from /lib/tls/libpthread.so.0
#6  0x00bba96e in clone () from /lib/tls/libc.so.6

Thread 2 (Thread -1212216416 (LWP 29850)):
#0  PyErr_Occurred () at Python/errors.c:80
#1  0x002ce167 in _PyObject_GC_Malloc (basicsize=40) at Modules/gcmodule.c:1326
#2  0x002ce21c in _PyObject_GC_NewVar (tp=0x3083c0, nitems=7) at Modules/gcmodule.c:1352
#3  0x00267c33 in PyTuple_New (size=7) at Objects/tupleobject.c:68
#4  0x0041cdc0 in ?? ()
#5  0x0007 in ?? ()
#6  0x001c in ?? ()
#7  0xb7beab18 in ?? ()
#8  0x0041cd4e in ?? ()
#9  0xb7be9af8 in ?? ()
#10 0xb758b22c in ?? ()
#11 0xb7be9b80 in ?? ()
#12 0x0042fed4 in ?? ()
#13 0x0042fed4 in ?? ()
#14 0xb7be9acc in ?? ()
#15 0xb7be9a2c in ?? ()
#16 0xfbad8001 in ?? ()
#17 0xb7be9ca0 in ?? ()
#18 0xb7be9ca0 in ?? ()
#19 0xb7be9ca0 in ?? ()
#20 0xb7be9ca0 in ?? ()
#21 0x0042fed4 in ?? ()
#22 0x08098608 in apr_bucket_type_eos ()
#23 0x09ba7920 in ?? ()
#24 0x002c in ?? ()
#25 0x in ?? ()
#26 0x0413 in ?? ()
#27 0x in ?? ()
---Type <return> to continue, or q <return> to quit---

Thread 1 (Thread -1208453440 (LWP 29847)):
#0  0x00ad57a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1  0x00c787c7 in do_sigwait () from /lib/tls/libpthread.so.0
#2  0x00c7888f in sigwait () from /lib/tls/libpthread.so.0
#3  0x009775ea in apr_signal_thread (signal_handler=0xa75e30 <wsgi_check_signal>) at threadproc/unix/signals.c:383
#4  0x00a76b61 in wsgi_start_process (p=0x9a0d0a8, daemon=0x9a92dd0) at mod_wsgi.c:8483
#5  0x00a7707a in wsgi_manage_process (reason=0, data=0x9a92dd0, status=11) at mod_wsgi.c:7708
#6  0x009703c8 in apr_proc_other_child_alert (proc=0xbfea8f80, reason=0, status=11) at misc/unix/otherchild.c:115
#7  0x080817ad in ap_mpm_run (_pconf=0x9a0d0a8, plog=0x9a3b160, s=0x9a0ef48) at worker.c:1611
#8  0x08061d9c in main (argc=3, argv=0xbfea90e4) at main.c:730
(gdb) cont
Continuing.

Program received signal SIGSEGV, Segmentation fault.
PyErr_Occurred () at Python/errors.c:80
80      in Python/errors.c
(gdb) cont
Continuing.

Program terminated with signal SIGSEGV, Segmentation fault.
The program no longer exists.
(gdb) quit

After switching to WSGIApplicationGroup %{GLOBAL} my application started, but I have a few more applications on this apache instance so I can't use this kind of setup.

Is there anything interesting in the above gdb log? Any other commands that I can use next time?

--
Maciej Wisniowski
[modwsgi] Re: Segmentation fault - premature end of script headers
Not particularly useful unfortunately.

Next thing would be to determine whether the crash happens as a result of importing the WSGI script file itself, or due to the call of the WSGI application. Thus at the head of the WSGI script file add:

import sys
print >> sys.stderr, "START OF WSGI SCRIPT FILE"

and at the end of the WSGI script file add:

print >> sys.stderr, "END OF WSGI SCRIPT FILE"

If it isn't crashing at load of the WSGI script file, both should appear in the Apache error log.

If it does crash, add more debug output like that to ascertain which module being imported causes it to crash. If that is a big module, then you need to recursively work out what modules that module imports, do those imports at the start of the WSGI script file, and try to narrow down which module causes the crash.

I can't remember, but will test later, whether one can set an environment variable to force Python to log all imports. This will help narrow it down quicker.

The other option is, since it works in %{GLOBAL}, once everything is imported, iterate over the modules in sys.modules, find all that have __file__ referencing a .so file and print that out. That will tell you which C extension modules are being used. Standard ones should be okay, but third party ones would be worth a closer look.

More later.

Graham
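The last suggestion, listing the C extension modules that have been loaded, could look roughly like this. It is a sketch written for Python 3 rather than the Python 2 of this thread; the helper name is made up, and the suffix list is an assumption about what counts as a compiled extension on a given platform:

```python
import sys


def loaded_extension_modules(suffixes=(".so", ".pyd")):
    """Return sorted (name, path) pairs for imported modules backed by a compiled file."""
    found = []
    # Copy the items first: sys.modules can change while we iterate.
    for name, module in list(sys.modules.items()):
        path = getattr(module, "__file__", None)
        if isinstance(path, str) and path.endswith(suffixes):
            found.append((name, path))
    return sorted(found)


# Write to stderr so the output lands in the Apache error log.
for name, path in loaded_extension_modules():
    print(name, path, file=sys.stderr)
```

Run at the end of the WSGI script file, after the application's own imports, it would print one line per compiled module; anything outside the standard library is a candidate for closer inspection.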
[modwsgi] Re: Segmentation fault - premature end of script headers
2008/9/30 Pigletto [EMAIL PROTECTED]:

After switching to WSGIApplicationGroup %{GLOBAL} my application started, but I have a few more applications on this apache instance so I can't use this kind of setup.

Can you explain to me how WebFaction process/memory limits work?

If you don't have issues with the number of processes and only with overall memory usage, then create a separate daemon process group for each application, with each being forced to run in the main interpreter of its own process. Thus:

<VirtualHost *:2867>
ServerName my-domain.xyz

WSGIDaemonProcess rek-prod-app-1 user=xyz group=xyz processes=2 threads=1 \
    maximum-requests=500 inactivity-timeout=7200 stack-size=524288 \
    display-name=%{GROUP}

WSGIScriptAlias / /home2/(...)/rek_project-1.wsgi

<Directory /home2/(...)/rek_project-1/>
WSGIProcessGroup rek-prod-app-1
WSGIApplicationGroup %{GLOBAL}
Order deny,allow
Allow from all
</Directory>

WSGIDaemonProcess rek-prod-app-2 user=xyz group=xyz processes=2 threads=1 \
    maximum-requests=500 inactivity-timeout=7200 stack-size=524288 \
    display-name=%{GROUP}

WSGIScriptAlias /suburl /home2/(...)/rek_project-2.wsgi

<Directory /home2/(...)/rek_project-2/>
WSGIProcessGroup rek-prod-app-2
WSGIApplicationGroup %{GLOBAL}
Order deny,allow
Allow from all
</Directory>
</VirtualHost>

This would end up with similar memory usage, the difference being that the application instances are in separate processes rather than separate sub interpreters of the same process.

Graham
[modwsgi] Re: mod_wsgi on Python 3.0 (was Re: Python 2.6 and migration warnings flag for Python 3.0.)
The BaseHTTPRequestHandler in http.server of Python 3.0 also only makes headers available as Unicode (latin-1).

    headers = []
    while True:
        line = self.rfile.readline()
        headers.append(line)
        if line in (b'\r\n', b'\n', b''):
            break
    hfile = io.StringIO(b''.join(headers).decode('iso-8859-1'))
    self.headers = email.parser.Parser(_class=self.MessageClass).parse(hfile)

Thus, any WSGI server based on that would have no chance of getting access to headers in byte form.

Graham

2008/9/30 Graham Dumpleton [EMAIL PROTECTED]:

Can we stop with the "mod_wsgi should do this" or "mod_wsgi should do that". The Apache/mod_wsgi module is just one implementation of the WSGI specification. When talking about this you need to look at the bigger picture and at what other implementations exist, plus how they all work and interact with the web server they use.

Take CGI for example. If you are using a CGI-WSGI adapter, the WSGI environment will come in through os.environ. If you run Python 3.0 and look at os.environ you will get:

    Python 3.0rc1 (r30rc1:66499, Sep 18 2008, 21:39:06)
    [GCC 4.0.1 (Apple Computer, Inc. build 5341)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import os
    >>> os.environ['PATH']
    '/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/ose/bin:/usr/local/bin:/Users/grahamd/bin'
    >>> type(os.environ['PATH'])
    <class 'str'>

So, os.environ already holds values as Unicode string objects and not bytes. Thus there is no chance of them being passed to the application as bytes.

How they get to become Unicode strings depends on the platform. For Windows it uses:

    PyUnicode_FromWideChar()

So, input is Unicode to begin with. On UNIX boxes it uses:

    PyUnicode_FromString()

which presumably means it uses the default system encoding, whatever that might be.

Anyway, already you are stopped from communicating bytes to the WSGI application. One could say that the proposed amendments to the specification for Python 3.0 don't even consider this case, where the conversion is already done for you.
Anyway, I have to leave off for now as have to go home. As I sort of suggest above, keep in mind that the proposed amendments are trying to find a compromise that works for many hosting environments. Thus although you ideally may want bytes everywhere, that may not work in practice.

Graham
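The latin-1 argument in this thread can be made concrete with a few lines of Python 3. This is a sketch under stated assumptions: the header bytes below are invented for illustration (raw UTF-8 bytes in the path rather than %-escapes), and the helper names `recover_bytes` and `reinterpret` are hypothetical, not part of any WSGI server.

```python
# Bytes as they might arrive on the wire; the three path bytes are the
# UTF-8 encoding of the euro sign. This input is an assumption made up
# for the example.
raw_header = b"Referer: http://localhost/\xe2\x82\xac.html\r\n"

# What a latin-1 based server does. This step cannot fail, because every
# byte value 0-255 maps to exactly one code point.
as_text = raw_header.decode("iso-8859-1")


def recover_bytes(text):
    """Undo the server's latin-1 decode to get the original bytes back."""
    return text.encode("iso-8859-1")


def reinterpret(text, encoding="utf-8"):
    """Re-decode latin-1 'marshalled' text using the encoding the app expects."""
    return recover_bytes(text).decode(encoding)


# The header value as the application would see it, then as it was meant.
value = as_text.split(": ", 1)[1].strip()
intended = reinterpret(value)
```

The decode direction is lossless, which is why latin-1 can act as a marshalling format. The encode direction only works for text that was produced this way: calling `recover_bytes()` on a string that contains a real euro sign raises UnicodeEncodeError, which is the kind of traceback described above.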
[modwsgi] Re: mod_wsgi on Python 3.0 (was Re: Python 2.6 and migration warnings flag for Python 3.0.)
2008/9/30 Toshio Kuratomi [EMAIL PROTECTED]:

On Sep 29, 3:24 pm, Brian Smith [EMAIL PROTECTED] wrote:

Toshio Kuratomi wrote:

Graham Dumpleton wrote:

As to the HTTP request headers, the RFCs say they are effectively latin-1. Thus, all HTTP_? variables in WSGI environ can only be processed as latin-1 when converting to Unicode.

Converting these headers to unicode will lead to mangled data at times. Let's say that some web app needs to keep track of the referer information for some reason. If the app is referred to from http://localhost/€.html (Euro symbol.html) and it is encoded as utf-8 on the server then the server will send a header with this sequence of bytes::

    Referer http://localhost/%e2%82%ac.html

If mod_wsgi assumes latin-1 and converts that into unicode before it hits the app, the app will see this::

    Referer http://localhost/â%82¬.html

No, it will leave it as http://localhost/%e2%82%ac.html. It does (or should do) the Latin-1-to-Unicode conversion before it decodes URL encoding.

uhm... you're wrong here. url encoding and decoding operates on bytes. unicode is not bytes. so you can't go from byte string to unicode and then pass it through url decode. Or I suppose you can, but it isn't by any means the opposite of what you did to get the url escaped bytes so it's pretty senseless.

I tested that url with Firefox and Opera in Linux utf-8 and what happens is that Firefox does what Brian says. But testing Firefox in Windows XP it substitutes € for %80 and IE6 changes € to %e2%82%ac.

Unlike wsgi.input where the application *must* decide how to decode the data, you are trying to do automatic encoding of data in the wsgi server here. This will cause tracebacks on some unicode string input but not others (which is one of the reasons that people hate unicode handling in python-2). The tracebacks occur because latin-1 characters are a subset of Unicode characters (note that we're not dealing with code-point to byte mapping here, we're dealing with character mapping).
So you can always convert latin-1 to unicode. But you can't always convert Unicode to latin-1 (which is what this automatic conversion would attempt). It's much better for the application layer to always hand mod_wsgi byte types, never unicode.

The HTTP standards mandate Latin-1. Python 3.0 says all strings are Unicode. The encoding/decoding is needed to bridge the gap. Treating the HTTP headers as raw sequences of bytes and requiring Python applications to do their own manual decoding/encoding would not be Pythonic and the Python community wouldn't accept it.

I disagree. You are dealing with byte sequences here so you need to call them bytes. This *is* pythonic (as much as you can define that for a type that hasn't existed before :-). Look at the WSGI specification for python-2. It specifies storing the values in str type and not in unicode type and that's accepted by the Python community as Pythonic.

This takes care of the problem but is somewhat silly. We're basically using latin-1 as a marshalling format for passing bytes over the wire. So we have to convert the unicode to bytes as the first step in changing unicode characters outside the latin-1 range into bytes that can go over the wire. At that point converting the bytes back to unicode pretending they're latin-1 instead of utf-8 is just an extra step for no reason.

Again, I think you are misunderstanding the interaction between URL encoding and character encoding conversion. Mod_wsgi will (should) never do or undo URL-encoding itself for non-ASCII (%80-%FF) sequences.

I think that you are misunderstanding the interaction. And I think that % sequences should definitely be done by mod_wsgi. Ending up with a unicode string containing %encoded sequences is even worse than the other scenarios I described, as the application then has to convert from unicode to byte string, unquote the url quoting, and then convert back to unicode.
(Although this is alleviated in python3 by the fact that urllib.parse.quote()/unquote() take an encoding argument. So the extra steps are taken care of by the function). It would be much better for mod_wsgi to do the url quoting for the user, as converting between bytes and %escape sequences is 100% automatable. This is unlike converting between unicode and a sequence of bytes, where something has to decide what the character encoding is.

So -- WSGI should take care of %encoding because that's a job for a computer anyway. WSGI should not take care of the byte <=> unicode conversion because it doesn't know what encoding the bytes are in.

I have two files there. Both are named ½ñ.html (one-half tilde-lowercase-n .html). However one of the filenames is encoded with latin-1 and the other with utf-8. If you switch between character encodings for the web page (firefox3: View::Character Encoding::UTF-8 vs View::Character
[modwsgi] Re: Segmentation fault - premature end of script headers
Now, again, my application is working with the same setup as before (without GLOBAL). I don't know why it started without a segfault this time. Nothing has changed.

I have to mention that the reason I was not able to start my application this morning was that my memory usage was over the limit (before this I was disconnected while gdb'ing my app on another Apache instance, and the gdb process was left hanging and using too much memory), so WebFaction killed my processes. After my processes were killed I had to start everything again, and I was not able to get one of my apps running (as you have seen already). So the important thing is that there were no changes in application code and no changes in Apache configuration.

Currently it works again and I can't do more debugging - it doesn't want to segfault. I've added some print statements as you suggested, but I think the wsgi script was imported properly when the segmentation fault occurred, because LoggingMiddleware had written empty oheaders.. and ocontent.. files.

Can you explain to me how WebFaction process/memory limits work?

There are no limits on the number of processes, only on memory usage.

If you don't have issues with number of processes and only overall memory usage, then create a separate daemon process group for each application with it being forced to run in main interpreter of its own process. Thus:

    <Directory /home2/(...)/rek_project-2/>
    WSGIProcessGroup rek-prod-app-2
    WSGIApplicationGroup %{GLOBAL}
    Order deny,allow
    Allow from all
    </Directory>
    </VirtualHost>

This would end up with similar memory usage, the difference being that the application instances are in separate processes rather than separate sub interpreters of same process.

OK, I'll try this.

The strange thing is that I had no segmentation faults for two days (since my previous post), and this morning I've seen them one after another.
I'm thinking about things like: the maximum requests per child setting in Apache, something with threading in Apache, or memcached. Memcached was not started while I was trying to start my application, but when I switched to %{GLOBAL} memcached was still down and it worked... I had segmentation faults before with locmem caching, so it is not an issue with memcached. AFAIR I saw some segfaults before I started using django-compress. Maybe this is something nasty in psycopg2. I'm thinking about adding print statements to all my middlewares and functions. This thing is really hard to debug, especially on a server that is used by real users.

Thank you very much for your help so far.

--
Maciej Wisniowski
[modwsgi] Re: Segmentation fault - premature end of script headers
What do you get if you run:

    ulimit -a

Maybe they have some sort of hard memory limits in place and you are hitting that.

Graham
[modwsgi] Re: mod_wsgi on Python 3.0 (was Re: Python 2.6 and migration warnings flag for Python 3.0.)
Toshio Kuratomi wrote:

On Sep 29, 3:24 pm, Brian Smith [EMAIL PROTECTED] wrote:

Toshio Kuratomi wrote:

If mod_wsgi assumes latin-1 and converts that into unicode before it hits the app, the app will see this::

    Referer http://localhost/â%82¬.html

No, it will leave it as http://localhost/%e2%82%ac.html. It does (or should do) the Latin-1-to-Unicode conversion before it decodes URL encoding.

uhm... you're wrong here. url encoding and decoding operates on bytes. unicode is not bytes. so you can't go from byte string to unicode and then pass it through url decode.

Original string in Latin-1: http://localhost/%e2%82%ac.html
Latin-1 to Unicode: http://localhost/%e2%82%ac.html

Since the original Latin-1 string did not contain any non-Latin characters, no codepoint conversions are performed.

Or I suppose you can, but it isn't by any means the opposite of what you did to get the url escaped bytes so it's pretty senseless.

I made a mistake about the *encoding* (not decoding) order in my previous email. I will correct it below.

Again, I think you are misunderstanding the interaction between URL encoding and character encoding conversion. Mod_wsgi will (should) never do or undo URL-encoding itself for non-ASCII (%80-%FF) sequences.

I think that you are misunderstanding the interaction. And I think that % sequences should definitely be done by mod_wsgi. Ending up with a unicode string containing %encoded sequences is even worse than the other scenarios I described, as the application then has to convert from unicode to byte string, unquote the url quoting, and then convert back to unicode.

mod_wsgi cannot decode all the % sequences in headers because it doesn't know which headers contain URIs and which ones don't; many headers can contain % sequences that don't mean the same thing they mean in URIs. Plus, sometimes (many times) the application needs the encoded URI instead of the IRI form.
If you are talking about things like PATH_INFO, SCRIPT_NAME, and REQUEST_URI, doing URI-IRI conversion on them will break applications like mine that already do their own URI-IRI conversion. I should test to see what WSGI gateways actually do there.

It would be much better for mod_wsgi to do the url quoting for the user, as converting between bytes and %escape sequences is 100% automatable. This is unlike converting between unicode and a sequence of bytes, where something has to decide what the character encoding is. So -- WSGI should take care of %encoding because that's a job for a computer anyway. WSGI should not take care of the byte <=> unicode conversion because it doesn't know what encoding the bytes are in.

mod_wsgi already mangles the URI components too much in SCRIPT_NAME and PATH_INFO (in its defense, it does so because CGI/WSGI require it to for the most part, except for // munging). That is why I fall back to parsing REQUEST_URI myself.

Now let's look at the reverse case: Let's say that the application wants to redirect the user to €.html (Euro symbol.html). For that, they have to enter this into the location header::

    real_url = '€.html'
    byte_sequence = real_url.encode('utf-8')
    marshalled_form = str(byte_sequence, 'latin-1')
    headers = [('location', marshalled_form)]

No, they have to URL-encode marshalled_form into ASCII first, because the Location header holds a URI, and URIs are always ASCII-only.

Well... between marshalled_form and HTTP HEADER, there needs to be a url escaping sequence, but whether that needs to happen outside of mod_wsgi or inside is part of what you and I are debating. You do see from your example above why your initial sequence for decoding at the top of the post is wrong, though? Your decoding sequence at the top placed the ASCII escaping between byte_sequence and real_url instead of between marshalled_form and headers.

Right, I made two mistakes here.
First, it doesn't make sense to URL-encode the string AFTER converting it to Latin-1. Instead, you need to URL-encode the string BEFORE converting it to Latin-1. Then, the string will only have ASCII characters. Secondly, you can encode/decode it using whatever encodings you please before you URL-encode it, because the URI and IRI specifications do not require every %XX sequence to decode to a valid UTF-8 sequence. mod_wsgi's own view of the filesystem encoding doesn't matter in this case.

Regards,
Brian
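The ordering described here (pick a character encoding, %-escape the resulting bytes, and only then hand the now ASCII-only value to the server) can be sketched in Python 3 with urllib.parse. This is an illustration under assumptions rather than anything mod_wsgi does itself; the helper name `location_value` is made up for the example.

```python
from urllib.parse import quote, unquote


def location_value(path, encoding="utf-8"):
    """Percent-encode a text path so that the result is pure ASCII.

    The application chooses the character encoding; quote() then turns
    every non-ASCII byte into a %XX escape.
    """
    return quote(path, safe="/", encoding=encoding)


# The same technique with two different encodings chosen by the application.
utf8_form = location_value("\u20ac.html")                            # euro sign
latin1_form = location_value("\u00bd\u00f1.html", encoding="latin-1")  # one-half, n-tilde

# Because the value is ASCII-only, a later latin-1 conversion by the
# server is an identity operation and cannot mangle it.
headers = [("Location", utf8_form)]
```

Here `utf8_form` is `%E2%82%AC.html` and `latin1_form` is `%BD%F1.html`. Decoding works the same way in reverse, for example `unquote(latin1_form, encoding="latin-1")`, as long as the receiver knows which character encoding was chosen.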
[modwsgi] Re: mod_wsgi on Python 3.0 (was Re: Python 2.6 and migration warnings flag for Python 3.0.)
2008/9/30 Brian Smith [EMAIL PROTECTED]:

mod_wsgi receives a sequence of bytes from apache. It transforms those into unicode by pretending that those bytes are latin-1 and sticks them into SCRIPT_NAME.

IMO, mod_wsgi should just drop SCRIPT_NAME and all other non-WSGI environ keys except REQUEST_URI (REQUEST_URI is needed to get the raw, un-decoded URI).

Did you perhaps mean SCRIPT_FILENAME? The WSGI specification requires SCRIPT_NAME.

As to this whole discussion, as interesting as it is, there is nothing I can do about it. It really needs to be brought up on the Python WEB-SIG, where I originally raised the issue of Python 3.0 support for WSGI. I can only implement whatever consensus comes out of discussion on the Python WEB-SIG, in lieu of them not wanting to come out with an official revised specification for WSGI.

Graham
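For applications that want the raw, un-decoded path mentioned above, a sketch of reading it from REQUEST_URI might look like the following. Several things here are assumptions: REQUEST_URI is a server-specific key rather than part of WSGI, it is assumed to be in origin form (a path plus an optional query string), the fallback assumes the server decoded SCRIPT_NAME and PATH_INFO as latin-1, and the helper name `raw_path_bytes` is hypothetical.

```python
from urllib.parse import unquote_to_bytes


def raw_path_bytes(environ):
    """Return the request path as bytes, without guessing a character encoding."""
    request_uri = environ.get("REQUEST_URI")
    if request_uri is None:
        # Fallback: rebuild the path from the standard WSGI keys, assuming
        # the server decoded them as latin-1 (an assumption, see lead-in).
        text = environ.get("SCRIPT_NAME", "") + environ.get("PATH_INFO", "")
        return text.encode("iso-8859-1")
    # Drop the query string by hand; a URL parser would treat a leading
    # "//" as the start of a host name, which is wrong for a request path.
    path = request_uri.partition("?")[0]
    # Undo only the %-escaping. Choosing a character encoding for the
    # resulting bytes stays with the application.
    return unquote_to_bytes(path)
```

The result is a byte string, so an application holding filenames in both latin-1 and utf-8 can try each decoding itself instead of depending on a single server-wide guess.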
[modwsgi] Re: Segmentation fault - premature end of script headers
On 30 Sep, 14:41, Graham Dumpleton [EMAIL PROTECTED] wrote:

What do you get if you run: ulimit -a
Maybe they have some sort of hard memory limits in place and you are hitting that.

Output of ulimit -a is:

    ---
    core file size          (blocks, -c) 0
    data seg size           (kbytes, -d) unlimited
    file size               (blocks, -f) unlimited
    pending signals                 (-i) 1024
    max locked memory       (kbytes, -l) 32
    max memory size         (kbytes, -m) unlimited
    open files                      (-n) 4096
    pipe size            (512 bytes, -p) 8
    POSIX message queues     (bytes, -q) 819200
    stack size              (kbytes, -s) 10240
    cpu time               (seconds, -t) unlimited
    max user processes              (-u) 200
    virtual memory          (kbytes, -v) unlimited
    file locks                      (-x) unlimited
    ---

AFAIK there is no hard limit at WebFaction. I have a 160 MB memory limit, but my processes were killed when memory usage was above 220 MB (oops..). Additionally, after every such incident I'm notified by WebFaction about the issue. So the other segmentation faults I've seen before are not connected with processes being killed due to memory problems.

One more question, as I'm a bit confused about the WSGIApplicationGroup directive. So far I was not using it at all. Does this mean that %{GLOBAL} was used implicitly, by default? I only had WSGIProcessGroup directives in use.

I've added a lot of print >>sys.stderr statements to my application and I will try to trigger the segmentation fault somehow...

--
Maciej Wisniowski
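The limits reported by ulimit apply to the shell that runs the command, and a daemon process started by Apache may have been given different ones. A sketch of reading the same values from inside a Python process, using the standard `resource` module (Unix only), could look like this. The label strings and the helper names `current_limits` and `show` are chosen for illustration, and constants missing on a given platform are skipped.

```python
import resource

# Human-readable labels (borrowed from the ulimit output) paired with the
# names of the corresponding constants in the resource module.
_LIMITS = [
    ("core file size", "RLIMIT_CORE"),
    ("data seg size", "RLIMIT_DATA"),
    ("max memory size", "RLIMIT_RSS"),
    ("open files", "RLIMIT_NOFILE"),
    ("stack size", "RLIMIT_STACK"),
    ("max user processes", "RLIMIT_NPROC"),
    ("virtual memory", "RLIMIT_AS"),
]


def current_limits():
    """Return {label: (soft, hard)} for every limit this platform exposes."""
    limits = {}
    for label, name in _LIMITS:
        const = getattr(resource, name, None)
        if const is None:
            continue  # not every platform defines every constant
        limits[label] = resource.getrlimit(const)
    return limits


def show(value):
    """Format a limit value the way ulimit prints it."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


if __name__ == "__main__":
    for label, (soft, hard) in sorted(current_limits().items()):
        print("%-20s soft=%s hard=%s" % (label, show(soft), show(hard)))
```

Note that `getrlimit()` reports sizes in bytes where ulimit shows kbytes. Calling `current_limits()` from the WSGI script and writing the result to sys.stderr would put the values in the Apache error log.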
[modwsgi] OS X compile problem
Hi All,

I am trying to install mod_wsgi on my Mac. It gave me some errors, so I updated Xcode to the latest version; however, it still doesn't compile. It seems it has a problem with 64-bit. Here is the result of ./configure:

    checking for apxs2... no
    checking for apxs... /usr/sbin/apxs
    checking Apache version... 2.2.8
    checking for python... /Library/Frameworks/Python.framework/Versions/Current/bin/python
    configure: creating ./config.status
    config.status: creating Makefile

and here is the error:

    /usr/sbin/apxs -c -I/Library/Frameworks/Python.framework/Versions/2.5/include/python2.5 -DNDEBUG -Wc,'-arch ppc7400' -Wc,'-arch ppc64' -Wc,'-arch i386' -Wc,'-arch x86_64' mod_wsgi.c -arch ppc7400 -arch ppc64 -arch i386 -arch x86_64 -Wl,-F/Library/Frameworks -framework Python -u _PyMac_Error -ldl
    /usr/share/apr-1/build-1/libtool --silent --mode=compile gcc -DDARWIN -DSIGPROCMASK_SETS_THREAD_MASK -no-cpp-precomp -I/usr/include/apache2 -I/usr/include/apr-1 -I/usr/include/apr-1 -arch ppc7400 -arch ppc64 -arch i386 -arch x86_64 -I/Library/Frameworks/Python.framework/Versions/2.5/include/python2.5 -DNDEBUG -c -o mod_wsgi.lo mod_wsgi.c touch mod_wsgi.slo
    In file included from /Library/Frameworks/Python.framework/Versions/2.5/include/python2.5/Python.h:57, from mod_wsgi.c:113:
    /Library/Frameworks/Python.framework/Versions/2.5/include/python2.5/pyport.h:761:2:In file included from /Library/Frameworks/Python.framework/Versions/2.5/include/python2.5/Python.h:57 , from mod_wsgi.c:113:
    /Library/Frameworks/Python.framework/Versions/2.5/include/python2.5/pyport.h:761:2: error: error: #error #error LONG_BIT definition appears wrong for platform (bad gcc/glibc config?).LONG_BIT definition appears wrong for platform (bad gcc/glibc config?).
    lipo: can't figure out the architecture type of: /var/folders/Oz/OzJCk42BHE4eZJoJ1zy2PTI/-Tmp-//cccGDuqg.out
    apxs:Error: Command failed with rc=65536

Any idea how to solve this problem?
Tnx,
Arash
[modwsgi] Re: Segmentation fault - premature end of script headers
I've managed to get a segmentation fault (I was just clicking around my application, and I forced a few reloads of mod_wsgi by changing the wsgi script, etc.), and I was able to reproduce it a few times. Again, I connected to it with gdb, but this time I issued the command 'share' before 'bt'. Thanks to this I was able to see much more interesting things. The WSGI script is executed, processing reaches my function (a view in Django) and an exception is raised inside the view. Below is the long output of gdb. It seems to me that it is a psycopg2 issue...?

In my code it is like:

    class OrManager(models.Manager):
        def latest(self, count=5):
            latest = cache.get('latest-offers')
            if latest is None:
                latest = self.filter(is_active=True).order_by('-date_added')[:count]
                print >>sys.stderr, latest  # THIS LINE FAILS - real execution of the SQL

I wonder whether this issue might be solved by using %{GLOBAL}?

GDB session:

    (...)
    (gdb) cont
    Continuing.

    Program received signal SIGSEGV, Segmentation fault.
    [Switching to Thread -1212707936 (LWP 9463)]
    PyErr_Occurred () at Python/errors.c:80
    80      Python/errors.c: No such file or directory.
            in Python/errors.c
    (gdb) bt
    #0 PyErr_Occurred () at Python/errors.c:80
    #1 0x00d65167 in _PyObject_GC_Malloc (basicsize=40) at Modules/gcmodule.c:1326
    #2 0x00d6521c in _PyObject_GC_NewVar (tp=0xd9f3c0, nitems=7) at Modules/gcmodule.c:1352
    #3 0x00cfec33 in PyTuple_New (size=7) at Objects/tupleobject.c:68
    #4 0x00400dc0 in ?? ()
    #5 0x0007 in ?? ()
    #6 0x0009 in ?? ()
    #7 0xb7b74aa8 in ?? ()
    #8 0x00400d4e in ?? ()
    #9 0x00d95980 in PyExc_IndexError () from /usr/lib/libpython2.5.so.1.0
    #10 0x in ?? ()
    (gdb) share
    Symbols already loaded for /lib/tls/libm.so.6
    Symbols already loaded for /home2/(...)/apache2.2//lib/libaprutil-1.so.0
    Symbols already loaded for /usr/lib/libsqlite3.so.0
    Symbols already loaded for /usr/lib/libexpat.so.0
    Symbols already loaded for /home2/(...)/apache2.2//lib/libapr-1.so.0
    Symbols already loaded for /lib/libuuid.so.1
    Symbols already loaded for /lib/tls/librt.so.1
    Symbols already loaded for /lib/libcrypt.so.1
    Symbols already loaded for /lib/tls/libpthread.so.0
    Symbols already loaded for /lib/libdl.so.2
    Symbols already loaded for /lib/tls/libc.so.6
    Symbols already loaded for /lib/ld-linux.so.2
    Symbols already loaded for /lib/libnss_files.so.2
    Symbols already loaded for /home2/(...)/apache2.2/modules/mod_wsgi.so
    Symbols already loaded for /usr/lib/libpython2.5.so.1.0
    Symbols already loaded for /lib/libutil.so.1
    Symbols already loaded for /home2/(...)/apache2.2/modules/mod_log_config.so
    Symbols already loaded for /home2/(...)/apache2.2/modules/mod_auth_basic.so
    Symbols already loaded for /home2/(...)/apache2.2/modules/mod_authz_user.so
    Symbols already loaded for /home2/(...)/apache2.2/modules/mod_authz_host.so
    Symbols already loaded for /home2/(...)/apache2.2/modules/mod_env.so
    Symbols already loaded for /home2/(...)/modules/mod_alias.so
    Symbols already loaded for /home2/(...)/modules/mod_auth_tkt.so
    Symbols already loaded for /home2/(...)/modules/mod_rewrite.so
    Reading symbols from /usr/local/lib/python2.5/lib-dynload/time.so...done.
    Loaded symbols for /usr/local/lib/python2.5/lib-dynload/time.so
    Reading symbols from /usr/local/lib/python2.5/lib-dynload/collections.so...done.
    Loaded symbols for /usr/local/lib/python2.5/lib-dynload/collections.so
    Reading symbols from /usr/local/lib/python2.5/lib-dynload/cStringIO.so...done.
    Loaded symbols for /usr/local/lib/python2.5/lib-dynload/cStringIO.so
    Reading symbols from /usr/local/lib/python2.5/lib-dynload/strop.so...done.
    Loaded symbols for /usr/local/lib/python2.5/lib-dynload/strop.so
    Reading symbols from /usr/local/lib/python2.5/lib-dynload/cPickle.so...done.
    Loaded symbols for /usr/local/lib/python2.5/lib-dynload/cPickle.so
    Reading symbols from /usr/local/lib/python2.5/lib-dynload/_socket.so...done.
    Loaded symbols for /usr/local/lib/python2.5/lib-dynload/_socket.so
    Reading symbols from /usr/local/lib/python2.5/lib-dynload/_ssl.so...done.
    Loaded symbols for /usr/local/lib/python2.5/lib-dynload/_ssl.so
    Reading symbols from /lib/libssl.so.4...done.
    Loaded symbols for /lib/libssl.so.4
    Reading symbols from /lib/libcrypto.so.4...done.
    Loaded symbols for /lib/libcrypto.so.4
    Reading symbols from /usr/lib/libgssapi_krb5.so.2...done.
    Loaded symbols for /usr/lib/libgssapi_krb5.so.2
    Reading symbols from /usr/lib/libkrb5.so.3...done.
    Loaded symbols for /usr/lib/libkrb5.so.3
    Reading symbols from /lib/libcom_err.so.2...done.
    Loaded symbols for /lib/libcom_err.so.2
    Reading symbols from /usr/lib/libk5crypto.so.3...done.
    Loaded symbols for /usr/lib/libk5crypto.so.3
    Reading symbols from /lib/libresolv.so.2...done.
    Loaded symbols for /lib/libresolv.so.2
    Reading symbols from /usr/lib/libz.so.1...done.
    Loaded symbols for /usr/lib/libz.so.1
    Reading symbols from /usr/local/lib/python2.5/lib-dynload/operator.so...done.
    Loaded symbols for
[modwsgi] Re: Segmentation fault - premature end of script headers
Currently I use psycopg2 from svn, a version dated January 2008. I've just looked at initd.org's svn and I see there is psycopg2-2.0.8, and in the change log from March I found:

    2008-03-07 James Henstridge [EMAIL PROTECTED]
    * psycopg/pqpath.c (_pq_fetch_tuples): Don't call Python APIs without holding the GIL.

Maybe that is the problem? I'll give the newest psycopg2 a try.

--
Maciej Wisniowski
[modwsgi] Re: Segmentation fault - premature end of script headers
On Tue, Sep 30, 2008 at 4:22 PM, Pigletto [EMAIL PROTECTED] wrote:

Maybe that is the problem? I'll give a try to newest psycopg2

2.0.8 definitely fixed some segfaults on my end.

Brett
[modwsgi] Re: mod_wsgi on Python 3.0 (was Re: Python 2.6 and migration warnings flag for Python 3.0.)
2008/10/1 Brian Smith [EMAIL PROTECTED]:
mod_wsgi already mangles the URI components too much in SCRIPT_NAME and PATH_INFO (in its defense, it does so because CGI/WSGI require it to for the most part, except for // munging). That is why I fall back to parsing REQUEST_URI myself.

In my defence, I remove leading duplicate slashes in SCRIPT_NAME because otherwise different major versions of Apache would behave differently. Any other duplicate slashes within the path of SCRIPT_NAME and PATH_INFO are, from memory, eliminated by Apache itself and not by mod_wsgi.

Graham
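The REQUEST_URI fallback mentioned above can be sketched roughly as follows. This is only an illustration, assuming the server puts the raw request target in environ['REQUEST_URI'] (Apache provides it, but it is not part of the WSGI specification); the function name raw_path_and_query is hypothetical.

```python
def raw_path_and_query(environ):
    """Return (path, query) exactly as the client sent them, or None.

    Hypothetical helper: relies on the non-standard REQUEST_URI key.
    The split is done by hand rather than with urlsplit(), because
    urlsplit() would treat a leading '//' as the start of a host name.
    Absolute-form request targets (http://host/path) are not handled.
    """
    uri = environ.get("REQUEST_URI")
    if uri is None:
        return None
    # Everything before the first '?' is the path, still percent-encoded
    # and with any duplicate slashes intact.
    path, _, query = uri.partition("?")
    return path, query


if __name__ == "__main__":
    sample = {"REQUEST_URI": "//app//a%2Fb?x=1"}
    print(raw_path_and_query(sample))
```

Because nothing is decoded or normalised, the application can compare this path with SCRIPT_NAME and PATH_INFO to see what the server changed.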
[modwsgi] Re: OS X compile problem
2008/10/1 Arash Arfaee [EMAIL PROTECTED]:
Hi All,

I am trying to install modwsgi on my Mac. It gave me some errors, so I updated Xcode to the latest version, but it still doesn't compile. It seems to have a problem with 64-bit. Here is the result of ./configure:

checking for apxs2... no
checking for apxs... /usr/sbin/apxs
checking Apache version... 2.2.8
checking for python... /Library/Frameworks/Python.framework/Versions/Current/bin/python
configure: creating ./config.status
config.status: creating Makefile

and here is the error:

/usr/sbin/apxs -c -I/Library/Frameworks/Python.framework/Versions/2.5/include/python2.5 -DNDEBUG -Wc,'-arch ppc7400' -Wc,'-arch ppc64' -Wc,'-arch i386' -Wc,'-arch x86_64' mod_wsgi.c -arch ppc7400 -arch ppc64 -arch i386 -arch x86_64 -Wl,-F/Library/Frameworks -framework Python -u _PyMac_Error -ldl
/usr/share/apr-1/build-1/libtool --silent --mode=compile gcc -DDARWIN -DSIGPROCMASK_SETS_THREAD_MASK -no-cpp-precomp -I/usr/include/apache2 -I/usr/include/apr-1 -I/usr/include/apr-1 -arch ppc7400 -arch ppc64 -arch i386 -arch x86_64 -I/Library/Frameworks/Python.framework/Versions/2.5/include/python2.5 -DNDEBUG -c -o mod_wsgi.lo mod_wsgi.c touch mod_wsgi.slo
In file included from /Library/Frameworks/Python.framework/Versions/2.5/include/python2.5/Python.h:57,
from mod_wsgi.c:113:
/Library/Frameworks/Python.framework/Versions/2.5/include/python2.5/pyport.h:761:2:In file included from /Library/Frameworks/Python.framework/Versions/2.5/include/python2.5/Python.h:57,
from mod_wsgi.c:113:
/Library/Frameworks/Python.framework/Versions/2.5/include/python2.5/pyport.h:761:2: error: error: #error #error LONG_BIT definition appears wrong for platform (bad gcc/glibc config?).LONG_BIT definition appears wrong for platform (bad gcc/glibc config?).
lipo: can't figure out the architecture type of: /var/folders/Oz/OzJCk42BHE4eZJoJ1zy2PTI/-Tmp-//cccGDuqg.out
apxs:Error: Command failed with rc=65536

Any idea how to solve this problem?
See:

http://code.google.com/p/modwsgi/wiki/InstallationOnMacOSX#Non_Universal_Developer_Tools

In short, you are using MacPorts Python and it isn't a fully fat version. Alternatively, it is because MacPorts gcc is being used and it isn't fully fat.

What hardware are you running on: PPC or Intel, and a 32-bit or 64-bit chip?

Graham
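One way to see whether a given binary is "fat" is to read the start of the file and list the architectures in its universal (fat) Mach-O header. The sketch below is an illustration only: the function name fat_architectures and the small CPU_NAMES table are assumptions made for this example, and it covers just the four architectures named in the build command above.

```python
import struct

FAT_MAGIC = 0xCAFEBABE  # magic number of a universal (fat) Mach-O file

# Mach-O CPU type codes for the four architectures in the apxs command.
# ppc7400 is a subtype of the plain "ppc" CPU type.
CPU_NAMES = {
    7: "i386",
    0x01000007: "x86_64",
    18: "ppc",
    0x01000012: "ppc64",
}


def fat_architectures(data):
    """Return the architecture names listed in a fat header.

    `data` is the first bytes of the file. An empty list means the
    file is not a fat binary (for example a single-architecture build).
    """
    if len(data) < 8:
        return []
    # The fat header is always big-endian: magic, then entry count.
    magic, count = struct.unpack(">II", data[:8])
    if magic != FAT_MAGIC:
        return []
    names = []
    for i in range(count):
        # Each entry is five 32-bit fields; the first is the CPU type.
        entry = data[8 + i * 20: 8 + (i + 1) * 20]
        if len(entry) < 20:
            break
        cputype = struct.unpack(">I", entry[:4])[0]
        names.append(CPU_NAMES.get(cputype, hex(cputype)))
    return names
```

To use it, open the Python framework library or the compiler output in binary mode and pass the first few kilobytes, for example fat_architectures(open(path, "rb").read(4096)). The command-line tools `lipo -info` and `file` report the same information.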