Another question on SCRIPT_NAME, PATH_INFO etc. This time I am after information on what responsibilities an adapter for a specific web server has in respect of removal and/or preservation of repeating slashes in a request URI.

Take for example a WSGI application mounted at:

  /wsgi/a

and a request URI of:

  REQUEST_URI: '/////wsgi//////a///b//c/d'

What should SCRIPT_NAME and PATH_INFO be set to? Should repeating slashes be removed from SCRIPT_NAME so that it matches the normalised mount point, or should the repeating slashes be preserved?

Thus should the above REQUEST_URI yield:

  SCRIPT_NAME: '/wsgi/a'
  PATH_INFO: '///b//c/d'

or perhaps:

  SCRIPT_NAME: '/////wsgi//////a'
  PATH_INFO: '///b//c/d'

Similarly, should repeating slashes be left as is in PATH_INFO?

I note that path_info_pop() in paste says:

  >>> def call_it(script_name, path_info):
  ...     env = {'SCRIPT_NAME': script_name, 'PATH_INFO': path_info}
  ...     result = path_info_pop(env)
  ...     print 'SCRIPT_NAME=%r; PATH_INFO=%r; returns=%r' % (
  ...         env['SCRIPT_NAME'], env['PATH_INFO'], result)
  >>> call_it('/foo', '/bar')
  SCRIPT_NAME='/foo/bar'; PATH_INFO=''; returns='bar'
  >>> call_it('/foo/bar', '')
  SCRIPT_NAME='/foo/bar'; PATH_INFO=''; returns=None
  >>> call_it('/foo/bar', '/')
  SCRIPT_NAME='/foo/bar/'; PATH_INFO=''; returns=''
  >>> call_it('', '/1/2/3')
  SCRIPT_NAME='/1'; PATH_INFO='/2/3'; returns='1'
  >>> call_it('', '//1/2')
  SCRIPT_NAME='//1'; PATH_INFO='/2'; returns='1'

The last example demonstrates the need to treat repeating slashes as a single slash, but also seems to indicate that SCRIPT_NAME can have repeating slashes in it. Running path_info_pop() on my example yields:

  BEFORE: {'PATH_INFO': '///b//c/d', 'SCRIPT_NAME': '/////wsgi//////a'}
  RESULT: 'b'
  AFTER: {'PATH_INFO': '//c/d', 'SCRIPT_NAME': '/////wsgi//////a///b'}

wsgiref.util.shift_path_info() also treats repeating slashes as one, but it strips all the repeating slashes out:

  BEFORE: {'PATH_INFO': '///b//c/d', 'SCRIPT_NAME': '/////wsgi//////a'}
  RESULT: 'b'
  AFTER: {'PATH_INFO': '/c/d', 'SCRIPT_NAME': '/wsgi/a/b'}

What is the accepted convention for dealing with repeating slashes?
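
To make the two candidate behaviours concrete, here is a minimal sketch of how an adapter might split the request path either way. The helper names split_preserving() and split_normalising() are hypothetical, made up for illustration; they are not part of paste, wsgiref or any server adapter, and the sketch assumes the mount point itself is already normalised. The last few lines rerun the wsgiref.util.shift_path_info() case from above.

```python
import re
from wsgiref.util import shift_path_info


def split_preserving(request_path, mount_point):
    """Hypothetical helper: match the mount point segment by segment,
    tolerating repeated slashes, and keep the raw text in SCRIPT_NAME."""
    segments = [s for s in mount_point.split('/') if s]
    # Each mount point segment may be preceded by one or more slashes.
    pattern = ''.join('/+' + re.escape(s) for s in segments)
    # The match must end at a segment boundary (a slash or end of path).
    match = re.match('(' + pattern + ')(?=/|$)', request_path)
    if match is None:
        raise ValueError('request path is not under the mount point')
    script_name = match.group(1)
    return script_name, request_path[len(script_name):]


def split_normalising(request_path, mount_point):
    """Hypothetical helper: same split, but report the normalised mount
    point as SCRIPT_NAME and leave PATH_INFO untouched."""
    _, path_info = split_preserving(request_path, mount_point)
    return mount_point, path_info


uri = '/////wsgi//////a///b//c/d'
print(split_preserving(uri, '/wsgi/a'))   # ('/////wsgi//////a', '///b//c/d')
print(split_normalising(uri, '/wsgi/a'))  # ('/wsgi/a', '///b//c/d')

# The wsgiref case from above: both values come back with slashes collapsed.
environ = {'SCRIPT_NAME': '/////wsgi//////a', 'PATH_INFO': '///b//c/d'}
print(shift_path_info(environ), environ)
```

Either split leaves PATH_INFO as '///b//c/d'; the only difference is whether SCRIPT_NAME echoes the raw request text or the configured mount point.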
Should a web server adapter leave repeating slashes in both SCRIPT_NAME and PATH_INFO, or should it at least normalise SCRIPT_NAME so that it matches the designated mount point?

Thanks in advance.

Graham
_______________________________________________
Web-SIG mailing list
Web-SIG@python.org
Web SIG: http://www.python.org/sigs/web-sig