Re: [RELEASE CANDIDATE] libapreq2 2.11

2009-02-17 Thread Bojan Smojver
On Tue, 2009-02-17 at 08:47 +0200, Issac Goldstand wrote:
 I'm all
 for calling 2.11 a dud and restarting with 2.12 

Version numbers are cheap - go for it.

-- 
Bojan



Re: Optimize behaviour of reverse and forward worker

2009-02-17 Thread jean-frederic clere

Jim Jagielski wrote:


On Feb 14, 2009, at 9:09 AM, Ruediger Pluem wrote:


Currently we set is_address_reusable to 0 for the reverse and forward
workers. Is this really needed?
IMHO we could reuse the connection if it goes to the same target
(we already check this).

Regards

Rüdiger



For the generic proxy workers yes; if we are sure it
goes to the exact same host we could reuse. The current
impl is from a time when, iirc, we didn't check...


Could we also export a routine to create a worker?

Cheers

Jean-Frederic


Re: Problems with EOS optimisation in ap_core_output_filter() and file buckets.

2009-02-17 Thread Mladen Turk

Graham Dumpleton wrote:

2009/2/17 Mladen Turk mt...@apache.org:

Graham Dumpleton wrote:

2009/2/17 Joe Orton jor...@redhat.com:

I did use to perform a dup, but was told that this would cause
problems with file locking. Specifically I was told:

I'm getting lost here.  What has file locking got to do with it?  Does
mod_wsgi rely on file locking somehow?

I'm lost as well :)


Consider:

  fd1 = 

  lock(fd1)

  fd2 = dup(fd1)

  close(fd2) # will release the lock under some lock APIs even though
             # not the last reference to the underlying file object

  write(fd1) # lock already released, so not guaranteed to be the only writer

  close(fd1)

At least that is how I understand it from what is being explained to
me and pointed out in various documentation.

So, if fd2 is the file descriptor created for the file bucket in Apache,
and it gets closed before the application later wants to write to the file
through fd1, then the application has lost the exclusive ownership it
acquired by way of the lock; something else could have acquired the
lock and started modifying the file on the basis that it has exclusive
ownership at that time.
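[Editorial note: the close-releases-lock behaviour described above is the documented POSIX fcntl() record-lock semantic. A small Python sketch, not from the thread (all names are mine), that demonstrates it by forking a child process to probe the lock:]

```python
import fcntl
import os
import tempfile

def child_can_lock(path):
    """Fork a child that tries a non-blocking exclusive lock and report success."""
    pid = os.fork()
    if pid == 0:
        probe = os.open(path, os.O_RDWR)
        try:
            fcntl.lockf(probe, fcntl.LOCK_EX | fcntl.LOCK_NB)
            os._exit(0)  # lock acquired: the parent no longer holds it
        except OSError:
            os._exit(1)  # lock refused: the parent still holds it
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) == 0

tmp_fd, path = tempfile.mkstemp()
os.close(tmp_fd)

fd1 = os.open(path, os.O_RDWR)
fcntl.lockf(fd1, fcntl.LOCK_EX)          # lock(fd1)
held_before = not child_can_lock(path)   # another process is locked out

fd2 = os.dup(fd1)
os.close(fd2)                            # close(fd2): drops the whole fcntl lock

held_after = not child_can_lock(path)    # fd1 is still open, but the lock is gone

os.close(fd1)
os.unlink(path)
```

Closing any descriptor for the file releases all of the process's fcntl locks on it, which is exactly the hazard with handing the dup'd fd to a file bucket.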



Well, like I said, that won't work, nor is it portable
(e.g. apr_os_file_t is a HANDLE on win32).

What you will need is code that takes the Python
object and invokes the Python file API to feed the apr_bucket
(basically writing an apr_bucket_python_file).

However, the simplest thing might be an intermediate temp file, in
which case httpd could reference the file name rather than the file
object itself. Not sure how that would work with a dynamic file,
since APR and Python might use different platform locking
mechanisms.

Regards
--
^(TM)


Re: Problems with EOS optimisation in ap_core_output_filter() and file buckets.

2009-02-17 Thread Graham Dumpleton
2009/2/17 Mladen Turk mt...@apache.org:
 Graham Dumpleton wrote:

 2009/2/17 Joe Orton jor...@redhat.com:

 I did use to perform a dup, but was told that this would cause
 problems with file locking. Specifically I was told:

 I'm getting lost here.  What has file locking got to do with it?  Does
 mod_wsgi rely on file locking somehow?


 I'm lost as well :)

Consider:

  fd1 = 

  lock(fd1)

  fd2 = dup(fd1)

  close(fd2) # will release the lock under some lock APIs even though
             # not the last reference to the underlying file object

  write(fd1) # lock already released, so not guaranteed to be the only writer

  close(fd1)

At least that is how I understand it from what is being explained to
me and pointed out in various documentation.

So, if fd2 is the file descriptor created for the file bucket in Apache,
and it gets closed before the application later wants to write to the file
through fd1, then the application has lost the exclusive ownership it
acquired by way of the lock; something else could have acquired the
lock and started modifying the file on the basis that it has exclusive
ownership at that time.

 In WSGI applications, it is possible for the higher level Python web
 application to pass back a file object reference for the response with
 the intent that the WSGI adapter use any optimised methods available
 for sending it back as response. This is where file buckets come into
 the picture to begin with.

 Now it looks that you are trying to intermix the third party
 maintained native OS file descriptors and file buckets.
 You can create the apr_file_t from apr_os_file_t

Which is what it does. Simplified code below:

  apr_os_file_t fd = -1;
  apr_file_t *tmpfile = NULL;

  fd = PyObject_AsFileDescriptor(filelike);

  apr_os_file_put(&tmpfile, &fd, APR_SENDFILE_ENABLED, self->r->pool);

 (Think you'll have platform portability issues there)

The optimisation is only supported on UNIX systems.

 but the major problem would be to ensure the life cycle
 of the object, since Python has its own GC and httpd has
 its pools.
 IMHO you will need a new apr_bucket provider written in
 Python and C for something like that.

CPython uses reference counting. What is referred to as GC in Python
is actually just a mechanism that kicks in under certain circumstances
to break cycles between reference counted objects.
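[Editorial note: Graham's distinction can be seen directly in CPython. A sketch of mine, not from the thread: plain reference counting frees objects immediately, while the "GC" only exists to break reference cycles.]

```python
import gc
import weakref

class Node:
    pass

# Plain reference counting: freed the instant the last reference disappears.
n = Node()
alive = weakref.ref(n)
del n
refcount_freed = alive() is None          # no collector involved

# A reference cycle keeps counts above zero; only the cycle breaker frees it.
a, b = Node(), Node()
a.other, b.other = b, a                   # a <-> b cycle
cycle = weakref.ref(a)
del a, b
still_alive = cycle() is not None         # refcounts never hit zero
gc.collect()                              # the "GC" is just this cycle breaker
collected = cycle() is None               # gone only after collection
```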

Having a special bucket type which holds a reference to the Python
file object will not help anyway. This is because the close() method
of the Python file object can be called prior to the file bucket being
destroyed. This closing of the Python file object would occur before
the delayed write of file bucket resulting due to the EOS
optimisation. So, same problem as when using naked file descriptor.

Also, using a special bucket type opens another can of worms. This is
because multiple interpreters are supported, as well as multithreading.
Thus it would be necessary to track the named interpreter in use
within the bucket, reacquire the lock on the interpreter
being used, and ensure thread state is correctly reinstated. Although
possible to do, it gets a bit messy.

Holding onto the file descriptor to allow the optimisation isn't
really desirable for other reasons as well. This is because the WSGI
specification effectively requires the response content to have been
flushed out to the client before the final call back into the
application to clean up things. In the final call back into the
application to perform cleanup and close stuff like files, it could
technically rewrite the content of the file. If Apache has not
finished writing out the contents of the file, presuming the Python
file object hadn't been closed, then Apache would end up writing
different content from what was expected, and possibly truncated
content if the file was resized.

In summary: you need a way of knowing that when you flush
something it really has been flushed and that Apache is all done
with it.
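[Editorial note: the ordering guarantee Graham relies on is the WSGI (PEP 333) rule that the server must finish transmitting the response iterable before calling its close(). A toy sketch of mine, not mod_wsgi code; all names are hypothetical:]

```python
import io

class FileWrapper:
    """Minimal stand-in for the wsgi.file_wrapper a real server would supply."""
    def __init__(self, filelike, blksize=8192):
        self.filelike = filelike
        self.blksize = blksize
    def __iter__(self):
        # Yield fixed-size blocks until read() returns the empty sentinel.
        return iter(lambda: self.filelike.read(self.blksize), b'')
    def close(self):
        self.filelike.close()

def application(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    body = io.BytesIO(b'hello from a file-like response')
    return environ.get('wsgi.file_wrapper', FileWrapper)(body)

def run(app):
    """Toy server loop: drain the iterable completely, THEN call close().
    Deferring the actual write past close() -- as the EOS optimisation can
    do with a file bucket -- would break this contract."""
    def start_response(status, headers):
        pass
    result = app({}, start_response)
    try:
        return b''.join(result)
    finally:
        if hasattr(result, 'close'):
            result.close()

data = run(application)
```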

Graham


Re: Problems with EOS optimisation in ap_core_output_filter() and file buckets.

2009-02-17 Thread Graham Dumpleton
2009/2/17 Mladen Turk mt...@apache.org:
 Graham Dumpleton wrote:

 2009/2/17 Mladen Turk mt...@apache.org:

 Graham Dumpleton wrote:

 2009/2/17 Joe Orton jor...@redhat.com:

 I did use to perform a dup, but was told that this would cause
 problems with file locking. Specifically I was told:

 I'm getting lost here.  What has file locking got to do with it?  Does
 mod_wsgi rely on file locking somehow?

 I'm lost as well :)

 Consider:

  fd1 = 

  lock(fd1)

  fd2 = dup(fd1)

  close(fd2) # will release the lock under some lock APIs even though
             # not the last reference to the underlying file object

  write(fd1) # lock already released, so not guaranteed to be the only writer

  close(fd1)

 At least that is how I understand it from what is being explained to
 me and pointed out in various documentation.

 So, if fd2 is the file descriptor created for the file bucket in Apache,
 and it gets closed before the application later wants to write to the file
 through fd1, then the application has lost the exclusive ownership it
 acquired by way of the lock; something else could have acquired the
 lock and started modifying the file on the basis that it has exclusive
 ownership at that time.


 Well, like I said, that won't work, nor is it portable
 (e.g. apr_os_file_t is a HANDLE on win32)

I already said I only support the optimisation on UNIX. I don't care
about Windows.

 What you will need is the code that will take the Python
 object and invoke Python file api feeding the apr_bucket.
 (Basically writing the apr_bucket_python_file).

As I already tried to explain, even in the case of a bucket being
used to hold a reference to the Python object, that will not work
because of the guarantees that WSGI applications require regarding
data needing to be flushed.

 However the simplest thing might be an intermediate temp file, in
 which case httpd could reference the file name not the file
 object itself.

Which would likely be slower than using the existing fallback streaming
mechanism, which reads the file into memory in blocks and pushes
them through as transient buckets.

 Not sure how that would work with a dynamic file,
 since APR and Python might use different platform locking
 mechanisms.

Python uses operating system locking mechanisms, just like the APR library would.

Graham


Porting custom auth module to Apache 2.2

2009-02-17 Thread Jouni Mäkeläinen
Hi,

I have developed a custom authentication module for Apache 2.0, using a similar
module as a model. The authentication module first checks whether the request URI
contains a hex-coded, DES-encrypted string in the query string. If the URL doesn't
contain the parameter, cookies are checked. If the encrypted string is not found,
the user is redirected to a separate authentication server and, after
authentication, back to the original URL with the required parameter.

When the module calls the ap_auth_type(r) function, a segmentation fault occurs.

First I assumed that the module fails because of the changed AAA architecture
in Apache 2.2. Then I came across mod_auth_kerb, which should also work
with Apache 2.2 and takes a similar approach to authentication.

Here are the relevant parts of the module code:

static int authenticate_user(request_rec *r) {
    xxx_auth_config_rec *conf = ap_get_module_config(r->per_dir_config,
                                                     &auth_xxx_module);
    const char* encrypted_sso_str = NULL;
    if (r->args) {
        apr_table_t* qs = val_str2apr_table(r->pool, r->args, );
        encrypted_sso_str = apr_table_get(qs, conf->paramname);
        apr_table_clear(qs);
    }
    if (!encrypted_sso_str) {
        const char* cookie_str = apr_table_get(r->headers_in, "Cookie");
        ...
    }

    if (!encrypted_sso_str || apr_strnatcmp(encrypted_sso_str, "false") == 0) {
        // encrypted_sso_str not found, redirecting user to auth server
        // (first check the auth_type)
        if (tmp_auth_type && apr_strnatcasecmp(ap_auth_type(r), "auth_xxx") == 0) {
            *** BANG *** (ap_auth_type)
...
static void mod_auth_xxx_register_hooks(apr_pool_t *p)
{
    // APR_HOOK_FIRST to bypass other modules, tried also APR_HOOK_MIDDLE
    ap_hook_check_user_id(authenticate_user, NULL, NULL, APR_HOOK_FIRST);
}
...
module AP_MODULE_DECLARE_DATA auth_xxx_module =
{
    STANDARD20_MODULE_STUFF,
    create_auth_dir_config,      /* per-directory config creator */
    NULL,                        /* dir merger --- default is to override */
    NULL,                        /* server config creator */
    NULL,                        /* server config merger */
    auth_commands,               /* command table */
    mod_auth_xxx_register_hooks, /* callback for registering hooks */
};

In the server configuration I have the following common authentication lines:
<Location ...>
    ...
    AuthType auth_xxx
    Require valid-user
    ...
</Location>

I compile the module with apxs (CentOS 5.2 x86_64, Apache 2.2.3; also tried Apache
2.2.8) against libmcrypt (for the DES calculations):
apxs -lmcrypt -c mod_auth_xxx.c
Compilation generates some warnings, but nothing serious, I guess. After
compilation I copy .libs/mod_auth_xxx.so to the modules directory
(/usr/lib64/httpd/modules/) and restart the httpd server. Everything seems to
work as expected, but when I try to access a protected file the process dies with a
segmentation fault. Here is the backtrace from the core dump:
#0  0x2af41b58b67f in apr_match_glob () from /usr/lib64/libapr-1.so.0
#1  0x2af4249ebb74 in authenticate_user (r=0x2af42ed75488) at 
mod_auth_xxx.c:159
#2  0x2af419cc5112 in ap_run_check_user_id () from /usr/sbin/httpd
#3  0x2af419cc6327 in ap_process_request_internal () from /usr/sbin/httpd
#4  0x2af419cd7eb8 in ap_process_request () from /usr/sbin/httpd
#5  0x2af419cd50f0 in ap_register_input_filter () from /usr/sbin/httpd
#6  0x2af419cd11c2 in ap_run_process_connection () from /usr/sbin/httpd
#7  0x2af419cdbe5b in ap_graceful_stop_signalled () from /usr/sbin/httpd
#8  0x2af419cdc0ea in ap_graceful_stop_signalled () from /usr/sbin/httpd
#9  0x2af419cdc1a0 in ap_graceful_stop_signalled () from /usr/sbin/httpd
#10 0x2af419cdccd8 in ap_mpm_run () from /usr/sbin/httpd
#11 0x2af419cb7183 in main () from /usr/sbin/httpd

Any help would be most welcome,
Jouni


Re: use of APR_SENDFILE_ENABLED in mod_disk_cache

2009-02-17 Thread Brian Akins
On 2/16/09 5:06 AM, Niklas Edmundsson ni...@acc.umu.se wrote:

 +core_dir_config *coreconf = ap_get_module_config(r->per_dir_config,
 +                                                 &core_module);


This is a perfect example of why we need a call to hide core_module stuff
from modules.  We talked about this before and we are still propagating
this bad habit, IMO.

--bakins