ID: 21226
Updated by: [EMAIL PROTECTED]
Reported By: [EMAIL PROTECTED]
-Status: Bogus
+Status: Open
Bug Type: *URL Functions
Operating System: w2000
PHP Version: 4.3.0
Assigned To: iliaa
New Comment:
Reopening this bug. A closer look at RFC 2396 indicates that:
"... This "generic URI" syntax consists of a sequence of four main
components:
<scheme>://<authority><path>?<query> ...
...
absoluteURI = scheme ":" ( hier_part | opaque_part )
URI that are hierarchical in nature use the slash "/" character for
separating hierarchical components. ...
...
hier_part = ( net_path | abs_path ) [ "?" query ]
net_path = "//" authority [ abs_path ]
abs_path = "/" path_segments
URI that do not make use of the slash "/" character for separating
hierarchical components are considered opaque by the generic URI
parser.
opaque_part = uric_no_slash *uric
uric_no_slash = unreserved | escaped | ";" | "?" | ":" | "@" |
"&" | "=" | "+" | "$" | ","
..."
Later in section 3.3 of that RFC the syntax of the path component is
clarified. Similar clarification is made in section 3.2 on what is
considered as a correct authority component.
Bottom line: the $url given by the bug reporter mostly conforms to the
syntax of a hierarchical URI, although it is not the usual case. Section
3.2, which deals w/ the authority component, states that:
"... The authority component is preceded by a double slash "//" and is
terminated by the next slash "/", question-mark "?", or by the end of
the URI. Within the authority component, the characters ";", ":",
"@", "?", and "/" are reserved. ..."
This is reinforced by the BNF syntax later in the RFC. I am not sure
whether all web servers will correctly interpret a URL w/o a path but w/
a query part immediately after the authority part, given that the "/" in
the path is usually mapped internally by the server to wherever the
physical files are in the filesystem.
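As a rough illustration of the section 3.2 rule quoted above, here is a minimal sketch of how the authority component could be cut out of a URI. The function name split_authority() is made up for this example and is not how parse_url() is implemented internally. It also simply looks for the first "//", which is a simplification.

```php
<?php
// Hypothetical helper for illustration only; not PHP's internal parser.
// Applies the RFC 2396 section 3.2 rule: the authority starts after "//"
// and ends at the next "/", "?", or at the end of the URI.
function split_authority($uri)
{
    $start = strpos($uri, '//');
    if ($start === false) {
        return null; // no "//", so no authority component
    }
    $start += 2; // skip the "//" itself

    // Length of the run, starting at $start, that contains neither "/" nor "?".
    $len = strcspn($uri, '/?', $start);

    return array(
        'authority' => substr($uri, $start, $len),
        'rest'      => (string) substr($uri, $start + $len),
    );
}

// A URL w/o a path but w/ a query right after the port.
print_r(split_authority('http://www.example.com:8080?bar=1'));
```

Under this rule the "?" ends the authority, so the port would be read as 8080 and "?bar=1" would be left over as the query part.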
The following code works as expected:
$url = "http://user:passwd@www.example.com:8080/foo.php?bar=1&boom=0";
print_r(parse_url($url));
Giving as output:
Array
(
[scheme] => http
[host] => www.example.com
[port] => 8080
[user] => user
[pass] => passwd
[path] => /foo.php
[query] => bar=1&boom=0
)
Tested w/ current CVS head on a RH Linux 6.1 machine:
$ php_cvs -v
PHP 4.4.0-dev (cli) (built: Dec 27 2002 14:00:56)
Copyright (c) 1997-2002 The PHP Group
Zend Engine v1.4.0, Copyright (c) 1998-2002 Zend Technologies
as well as 4.3.0 (on the same OS)
$ php -v
PHP 4.3.0 (cli) (built: Dec 29 2002 23:59:53)
Copyright (c) 1997-2002 The PHP Group
Zend Engine v1.3.0, Copyright (c) 1998-2002 Zend Technologies
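To compare the working case w/ the reported one on a given build, a small script along these lines may help. The URLs are assembled from the components shown in the output above; the second one drops the path so that the query follows the port directly. No expected output is given for it, since the result depends on the PHP version being tested.

```php
<?php
// Sketch of a comparison script. The URLs are assumptions built from the
// scheme, user, pass, host, port and query listed in the output above.
$urls = array(
    'http://user:passwd@www.example.com:8080/foo.php?bar=1&boom=0', // with path
    'http://user:passwd@www.example.com:8080?bar=1&boom=0',         // query right after port
);

foreach ($urls as $url) {
    echo $url, "\n";
    $parts = parse_url($url);
    if ($parts === false) {
        // parse_url() may return false for URLs it cannot handle.
        echo "parse_url() returned false\n";
        continue;
    }
    print_r($parts);
}
```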
Previous Comments:
------------------------------------------------------------------------
[2002-12-28 09:10:25] [EMAIL PROTECTED]
Thank you for your work on my report, but I'm surprised you marked this
report as bogus, since:
- 'port' was not a number in my example, but that was only to make it
easier to understand. The warning is the same with a numeric port.
- The same example worked without a warning in 4.2 and earlier.
- The parse_url function manual says nothing about requiring a slash
before the path or query part of a URL (I triple checked ;-)
- RFC 2396 (Uniform Resource Identifiers (URI): Generic Syntax) doesn't
specify that you *MUST* have / after the port part.
So if you consider these points, you should either modify the parse_url
function or modify the parse_url documentation. But please do not just
mark the report as bogus!!!
At worst, mark it as 'closed'.
Thanks for your help.
------------------------------------------------------------------------
[2002-12-28 00:39:43] [EMAIL PROTECTED]
Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.php.net/manual/ and the instructions on how to report
a bug at http://bugs.php.net/how-to-report.php
The port can only be a number from 0-99999 (although in reality the
port range is 1-65535), so a non-numeric port is clearly not valid,
which invalidates the passed URL.
The 2nd example is also wrong: without the '/' between the port & the
rest of the request, the code MUST assume that the following data is
part of the port, hence the URL is once again not valid.
This is NOT a bug.
------------------------------------------------------------------------
[2002-12-27 19:21:59] [EMAIL PROTECTED]
Adding a / after the end of the port part is a good solution. Thanks.
Do you consider this a bug, or is parse_url RFC compliant?
------------------------------------------------------------------------
[2002-12-27 19:04:18] [EMAIL PROTECTED]
The problem seems to come from the 'port' part of the URL.
Consider the following:
user:password@host?foo
works fine,
user:password@host:port?foo
fails
but
user:password@host?foo:port
works and returns a port number in $p_url[port], but that is not correct
------------------------------------------------------------------------
[2002-12-27 19:00:53] [EMAIL PROTECTED]
Some more info: port has to be numeric and less than 5 digits long.
The bug seems to be that parse_url() doesn't recognize the end of the
port part if '/' is missing. So
$url="http://user:[EMAIL PROTECTED]:80/?foo";
works as expected.
------------------------------------------------------------------------
The remainder of the comments for this report are too long. To view
the rest of the comments, please view the bug report online at
http://bugs.php.net/21226