Edit report at https://bugs.php.net/bug.php?id=47070&edit=1

 ID:                 47070
 Comment by:         darrel dot opry at gmail dot com
 Reported by:        darrel dot opry at gmail dot com
 Summary:            php_stream_locate_url_wrapper fails without
                     authority section
 Status:             Open
 Type:               Feature/Change Request
 Package:            Streams related
 Operating System:   Ubuntu 8.10
 PHP Version:        5.2.8
 Block user comment: N
 Private report:     N

 New Comment:

Here is my take, as a non-C developer, on how this function could be
re-implemented to be a little easier to follow. My general view is that the
'file' scheme handling in this function (authority and path parsing) is out of
place and should be moved to the actual plain files stream wrapper.

In general I believe stream wrappers, and code using stream wrappers, could be
made a little more efficient by fully parsing the URL in this function,
possibly with parse_url, using the resulting scheme to select a stream wrapper,
and passing the results of that same parse into the stream methods instead of
the URL itself. At the very least, the pre-parsed authority and path sections
could be passed into stream_open. This would save developers from making a
duplicate call to parse_url from interpreted code.
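
To make the "parse once, pass the parts along" idea concrete, here is a minimal
sketch in plain C, outside of PHP internals. The names url_parts and split_url
are hypothetical and exist only for illustration; the buffer sizes are
arbitrary, and the one-character exclusion simply mirrors the existing n > 1
check so that a Windows drive letter is not mistaken for a scheme.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical container for the pieces of a URL; not a PHP internals type. */
struct url_parts {
    char scheme[32];
    char authority[128];
    char path[256];
};

/* Copy at most dst_size - 1 bytes of src[0..len) into dst and NUL-terminate. */
static void copy_span(char *dst, size_t dst_size, const char *src, size_t len)
{
    if (len >= dst_size) {
        len = dst_size - 1;
    }
    memcpy(dst, src, len);
    dst[len] = '\0';
}

/*
 * Split "scheme:[//authority]path" into its parts.
 * Returns 1 if a scheme was found, 0 if url is treated as a plain path.
 */
static int split_url(const char *url, struct url_parts *out)
{
    const char *p = url;
    const char *rest;

    memset(out, 0, sizeof(*out));

    /* A scheme starts with a letter, followed by letters, digits, '+', '-', '.'. */
    if (isalpha((unsigned char)*p)) {
        while (isalnum((unsigned char)*p) || *p == '+' || *p == '-' || *p == '.') {
            p++;
        }
    }

    /* No ':' delimiter, or a one-character "scheme" such as a drive letter. */
    if (*p != ':' || (p - url) < 2) {
        copy_span(out->path, sizeof(out->path), url, strlen(url));
        return 0;
    }

    copy_span(out->scheme, sizeof(out->scheme), url, (size_t)(p - url));
    rest = p + 1;

    /* The authority is optional and only present after "//". */
    if (rest[0] == '/' && rest[1] == '/') {
        const char *authority = rest + 2;
        size_t authority_len = strcspn(authority, "/");

        copy_span(out->authority, sizeof(out->authority), authority, authority_len);
        rest = authority + authority_len;
    }

    copy_span(out->path, sizeof(out->path), rest, strlen(rest));
    return 1;
}
```

With this sketch, file://localhost/tmp/a.txt splits into scheme "file",
authority "localhost" and path "/tmp/a.txt", while myscheme:some/path yields
an empty authority and the path "some/path". A wrapper handed these parts
would not need to parse the URL a second time.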

/* {{{ php_stream_locate_url_wrapper */
PHPAPI php_stream_wrapper *php_stream_locate_url_wrapper(const char *path, char **path_for_open, int options TSRMLS_DC)
{
        HashTable *wrapper_hash = (FG(stream_wrappers) ? FG(stream_wrappers) : &url_stream_wrappers_hash);
        php_stream_wrapper **wrapper = NULL;
        const char *ptr_scheme_delimiter = NULL;
        const char *scheme = NULL;
        int scheme_delimiter_position = 0;

        if (path_for_open) {
                *path_for_open = (char*)path;
        }

        if (options & IGNORE_URL) {
                return (options & STREAM_LOCATE_WRAPPERS_ONLY) ? NULL : &php_plain_files_wrapper;
        }

        // Loop over path as long as we have valid scheme characters
        // ([:alnum:], '+', '-', '.') until we reach the scheme delimiter.
        for (ptr_scheme_delimiter = path; isalnum((int)*ptr_scheme_delimiter) || *ptr_scheme_delimiter == '+' || *ptr_scheme_delimiter == '-' || *ptr_scheme_delimiter == '.'; ptr_scheme_delimiter++) {
                scheme_delimiter_position++;
        }

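        // Why is 'data:' being checked for here? Presumably because data URIs
        // have no "//" after the colon. Once "//" is no longer required, ':'
        // alone delimits the scheme and the special case is not needed.
        if ((*ptr_scheme_delimiter == ':') && (scheme_delimiter_position > 1)) {
                // scheme is allocated here and should be efree()d on every return path.
                scheme = estrndup(path, scheme_delimiter_position);
        }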

        // Convert the legacy zlib wrapper name. This should be removed, or
        // compress.zlib should be registered under both schemes until "zlib:"
        // is officially removed.
        if (scheme && scheme_delimiter_position == 4 && !strncasecmp(scheme, "zlib", 5)) {
                efree((char*)scheme);
                scheme = estrdup("compress.zlib");
                scheme_delimiter_position = 13;
                php_error_docref(NULL TSRMLS_CC, E_WARNING, "Use of \"zlib:\" wrapper is deprecated; please use \"compress.zlib://\" instead");
        }

        // If we matched a scheme...
        if (scheme) {
                // ...attempt to look up the wrapper in the wrapper hash using the scheme.
                if (FAILURE == zend_hash_find(wrapper_hash, (char*)scheme, scheme_delimiter_position + 1, (void**)&wrapper)) {
                        char wrapper_name[32];
                        size_t name_length = scheme_delimiter_position;

                        if (name_length >= sizeof(wrapper_name)) {
                                name_length = sizeof(wrapper_name) - 1;
                        }
                        PHP_STRLCPY(wrapper_name, scheme, sizeof(wrapper_name), name_length);
                        php_error_docref(NULL TSRMLS_CC, E_WARNING, "Unable to find the wrapper \"%s\" - did you forget to enable it when you configured PHP?", wrapper_name);

                        // Unknown scheme: forget it so the path is treated as a plain file below.
                        wrapper = NULL;
                        efree((char*)scheme);
                        scheme = NULL;
                }
        }

        // If we didn't find a wrapper yet, try defaulting to the file scheme.
        if (!wrapper) {
                // If the caller asked for wrappers only, exit now before going
                // any further.
                if (options & STREAM_LOCATE_WRAPPERS_ONLY) {
                        if (options & REPORT_ERRORS) {
                                php_error_docref(NULL TSRMLS_CC, E_WARNING, "Unable to find a suitable stream wrapper for %s.", path);
                        }
                        return NULL;
                }

                // Proceed on the assumption that we're working with files and
                // try to locate the file wrapper.
                if (FG(stream_wrappers)) {
                        /* Check again, the original check might have not known the protocol name */
                        if (zend_hash_find(wrapper_hash, "file", sizeof("file"), (void**)&wrapper) == FAILURE) {
                                if (options & REPORT_ERRORS) {
                                        php_error_docref(NULL TSRMLS_CC, E_WARNING, "file: wrapper is disabled in the server configuration");
                                }
                                return NULL;
                        }
                }
        }

        // If we have a wrapper and matched a scheme other than file, enforce
        // the URL restrictions before returning it.
        if (wrapper && scheme && strncasecmp(scheme, "file", sizeof("file")) != 0) {
                if ((*wrapper)->is_url && (options & STREAM_DISABLE_URL_PROTECTION) == 0
                        && (!PG(allow_url_fopen)
                                || (((options & STREAM_OPEN_FOR_INCLUDE) || PG(in_user_include)) && !PG(allow_url_include))
                        )
                ) {
                        if (options & REPORT_ERRORS) {
                                // scheme is already NUL-terminated, so it can be printed directly.
                                if (!PG(allow_url_fopen)) {
                                        php_error_docref(NULL TSRMLS_CC, E_WARNING, "%s: wrapper is disabled in the server configuration by allow_url_fopen=0", scheme);
                                } else {
                                        php_error_docref(NULL TSRMLS_CC, E_WARNING, "%s: wrapper is disabled in the server configuration by allow_url_include=0", scheme);
                                }
                        }
                        return NULL;
                }
                return *wrapper;
        }

        // If we've gotten here we're using file, whether it was defaulted or not.
        // The following code could be simplified if it were moved into the php
        // plain files wrapper. In general, the work of handling the authority
        // and path sections of the URI should be handed off to the stream
        // wrappers. It would be really nice if the stream wrapper methods
        // received a parsed URL in their stream_open methods, saving developers
        // the trouble of parsing out the authority and path in interpreted code.


        if (scheme) {
                int localhost = 0;

                // Only a path that actually carried the file scheme needs fixing
                // up; a plain path is passed through untouched. The offsets
                // below still assume the "file://" form.
                if (!strncasecmp(path, "file://localhost/", 17)) {
                        localhost = 1;
                }

                // Validate that we're only using localhost
#ifdef PHP_WIN32
                if (localhost == 0 && path[scheme_delimiter_position+3] != '\0' && path[scheme_delimiter_position+3] != '/' && path[scheme_delimiter_position+4] != ':') {
#else
                if (localhost == 0 && path[scheme_delimiter_position+3] != '\0' && path[scheme_delimiter_position+3] != '/') {
#endif
                        if (options & REPORT_ERRORS) {
                                php_error_docref(NULL TSRMLS_CC, E_WARNING, "remote host file access not supported, %s", path);
                        }
                        return NULL;
                }

                // So fix up the paths for files...
                if (path_for_open) {
                        /* skip past protocol and :/, but handle windows correctly */
                        *path_for_open = (char*)path + scheme_delimiter_position + 1;
                        if (localhost == 1) {
                                (*path_for_open) += 11;
                        }
                        while (*(++*path_for_open)=='/');
#ifdef PHP_WIN32
                        if (*(*path_for_open + 1) != ':')
#endif
                                (*path_for_open)--;
                }
        }

        return &php_plain_files_wrapper;
}
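
As an illustration of what the authority handling might look like once it
lives in the plain files wrapper, here is a small standalone sketch. It is not
PHP internals code: file_url_local_path and is_localhost are hypothetical
names, and treating an empty authority or "localhost" as the local machine is
an assumption borrowed from common file URL conventions, not from the current
streams code.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Case-insensitive check that s[0..len) spells "localhost". */
static int is_localhost(const char *s, size_t len)
{
    static const char name[] = "localhost";
    size_t i;

    if (len != sizeof(name) - 1) {
        return 0;
    }
    for (i = 0; i < len; i++) {
        if (tolower((unsigned char)s[i]) != name[i]) {
            return 0;
        }
    }
    return 1;
}

/*
 * Hypothetical helper: given the part of a file URL that follows "file:",
 * return a pointer to the local path, or NULL when the URL names a remote
 * host. The returned pointer points into after_scheme; nothing is copied.
 */
static const char *file_url_local_path(const char *after_scheme)
{
    const char *authority;
    size_t authority_len;

    /* No "//": there is no authority section, so everything is the path. */
    if (strncmp(after_scheme, "//", 2) != 0) {
        return after_scheme;
    }

    authority = after_scheme + 2;
    authority_len = strcspn(authority, "/");

    /* An empty authority or "localhost" both refer to the local machine. */
    if (authority_len == 0 || is_localhost(authority, authority_len)) {
        return authority + authority_len;
    }

    return NULL;
}
```

Under these assumptions, "///tmp/a.txt", "//localhost/tmp/a.txt" and
"/tmp/a.txt" all resolve to "/tmp/a.txt", and "//example.com/tmp/a.txt" is
rejected as a remote host.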


Previous Comments:
------------------------------------------------------------------------
[2011-11-12 05:50:46] dopry at rynassociates dot com

I believe the code that needs to be modified is in main/streams/streams.c,
in the function php_stream_locate_url_wrapper:

// iterate over path while the current character is alphanumeric,
// '+', '-', or '.'
for (p = path; isalnum((int)*p) || *p == '+' || *p == '-' || *p == '.'; p++) {
        n++;
}

// if p points at ':' and the scheme is longer than one character,
// and either "//" follows the ':' or the path starts with "data:"
if ((*p == ':') && (n > 1) && (!strncmp("//", p+1, 2) || (n == 4 && !memcmp("data:", path, 5)))) {
        protocol = path;
} else if (n == 5 && strncasecmp(path, "zlib:", 5) == 0) {
        /* BC with older php scripts and zlib wrapper */
        protocol = "compress.zlib";
        n = 13;
        php_error_docref(NULL TSRMLS_CC, E_WARNING, "Use of \"zlib:\" wrapper is deprecated; please use \"compress.zlib://\" instead");
}

I believe the solution is to remove the !strncmp("//", p+1, 2) check from the
if statement above, so that // is not a required part of the validation.
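
To show the effect of dropping that check, here is a small standalone sketch
of the scheme detection with the "//" requirement made switchable. The
function name scheme_length and its require_slashes flag are hypothetical and
only serve the comparison; the character set and the n > 1 rule follow the
loop quoted above.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Return the length of the scheme at the start of path, or 0 if none.
 * require_slashes = 1 mimics the current behaviour ("//" must follow ':');
 * require_slashes = 0 accepts any "scheme:" prefix.
 */
static size_t scheme_length(const char *path, int require_slashes)
{
    const char *p = path;

    while (isalnum((unsigned char)*p) || *p == '+' || *p == '-' || *p == '.') {
        p++;
    }

    /* Need a ':' delimiter and more than one character, so "C:" stays a drive letter. */
    if (*p != ':' || (p - path) <= 1) {
        return 0;
    }

    if (require_slashes && strncmp(p + 1, "//", 2) != 0) {
        return 0;
    }

    return (size_t)(p - path);
}
```

With require_slashes set, "myscheme:path/x" is not recognised (the result is
0), whereas without it the scheme length 8 is returned. "myscheme://host/x" is
recognised either way, and a drive letter path such as C:\dir\file.txt is
rejected in both modes.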

The following logic for returning the stream wrapper looks like it can be 
further optimized by re-ordering/grouping some of the conditionals.

------------------------------------------------------------------------
[2009-07-27 08:39:56] j...@php.net

Reclassified.

------------------------------------------------------------------------
[2009-01-15 16:25:37] darrel dot opry at gmail dot com

A note regarding backwards compatibility: this change shouldn't
interfere with existing user-defined stream wrappers, since they
are currently not called at all when used without the authority
section.

------------------------------------------------------------------------
[2009-01-15 16:14:16] darrel dot opry at gmail dot com

I did RTFM. As a developer and user of PHP I disagree with your 
assertion. The use and parsing of the URL schema for streams is not 
the same as that of parse_url which is recommended for parsing 
incoming paths.

The documentation specifically uses the wording URL in many places, 
right to the point of having a setting for allow_url_fopen. 

The issue with the current approach is that it does not recognize
legally delimited schemes properly and pass them to the underlying
stream wrapper. It seems to make the invalid assumption that :// or
// is the scheme delimiter when it is in fact :. It's a minor parsing
issue that could probably be corrected by an experienced C developer
in under 20 minutes.

Why is this an issue you may ask... 

parse_url properly parses the scheme, user, password, host, port, and 
path. 

If I want to use parse_url within my stream wrapper I have to 
concatenate the host and path elements if I use a URL in the form of 
file://path/to/file.txt or I have to use a url in the form of 
file://localhost/path/to/file.txt if I want to avoid this 
concatenation.

A secondary impact: if I'm storing URLs to resources in the database,
I now have to store the authority section as well. This is suboptimal
for me as an application developer as the number of resources I'm
storing references to increases.

So maybe you should go RYOFM, before dismissing something out of hand
just because it references an RFC without thinking of the
implications.

Basically I think most developers really want their stream wrapper 
called if scheme: is properly designated, so they can write standards 
compliant code.

------------------------------------------------------------------------
[2009-01-15 15:26:34] j...@php.net

RTFM: http://www.php.net/fopen  (look for explanation what PHP expects the 
filename parameter to look like...you can put as many RFCs here, but it's never 
said anywhere that fopen() expects some RFCs style scheme.. :)

------------------------------------------------------------------------


The remainder of the comments for this report are too long. To view
the rest of the comments, please view the bug report online at

    https://bugs.php.net/bug.php?id=47070

