From:             
Operating system: Windows Server 2008
PHP version:      5.3.10
Package:          PCRE related
Bug Type:         Bug
Bug description:  Unexplained bool(false) returned from preg_match

Description:
------------
PHP VC9 x86 Thread Safe (from http://windows.php.net/download/)

I am using a regex to check whether a string is a valid hostname (host or
FQDN).

For strings above a certain length, a pattern that tries to match a literal
period at the end causes preg_match to return false when the string does
not contain a period. preg_match also returns false when the string ends
with a period and the pattern does not try to match it.

The regex uses subpatterns () so that the zero-or-more quantifier * can be
applied to them. I tried both capturing () and non-capturing (?:) groups;
both yield the same result. However, with the one-or-more quantifier + it
does not return bool(false). Using {0,} instead of * does not change the
outcome.
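
One variant not tried above, offered as a sketch rather than a confirmed
fix: the inner group (?:[[:alnum:]\-]*[[:alnum:]]) can already consume the
rest of a label by itself, so repeating it with * lets the engine split one
label in many equivalent ways. Making the group optional with ? accepts the
same labels with far less ambiguity. The $label variable and the reduced
host list are illustrative only.

```php
<?php
// Assumption: same label grammar as the report, but the inner group is
// optional (?) instead of repeated (*).
$label = '[[:alnum:]](?:[[:alnum:]\-]*[[:alnum:]])?';
$regex = '/^(?:' . $label . '\.)*' . $label . '$/';

$hosts = array(
    'ABCDEFGHIJ1234567890.',          // trailing period: should not match
    'ABCDEFGHIJ1234567890',           // long label, no period
    'ABCDEFGHIJ-1234567890',          // long label with hyphen
    'WWW.ABCDEFGHIJ-1234567890.COM',  // FQDN
);

foreach ($hosts as $host) {
    echo "$host: ";
    var_dump(preg_match($regex, $host));
}
```

With this variant the hosts above should produce int(0) or int(1) rather
than bool(false); whether it also behaves on the build described in this
report has not been verified.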

The cutoff length for the string seems to be about 20 characters. Below
that, the result is int(0) or int(1) depending on whether the regex
matches; above that, bool(false) is returned.
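
A length cutoff like this would be consistent with a PCRE resource limit
being reached, although that is an assumption and not something this report
confirms. As a rough illustration, the sketch below counts how many ways a
label of n alphanumeric characters can be divided among repetitions of the
inner group, and prints the limits this PHP build is configured with. The
function label_split_count() is hypothetical, and the count is only a lower
bound on the work a backtracking engine might do.

```php
<?php
// Assumption: a label of $n alphanumeric characters, matched by
// [[:alnum:]](?:[[:alnum:]\-]*[[:alnum:]])* . The first character is taken
// by the leading class; the remaining $n - 1 characters can be cut into
// consecutive non-empty chunks, one per repetition of the inner group.
// The number of such cuts is 2^($n - 2) for $n >= 2.
function label_split_count($n)
{
    return ($n < 2) ? 1 : pow(2, $n - 2);
}

foreach (array(18, 19, 20, 21) as $n) {
    printf("%2d chars: %d possible splits\n", $n, label_split_count($n));
}

// The limits PCRE is configured with in this PHP build.
echo 'pcre.backtrack_limit = ', ini_get('pcre.backtrack_limit'), "\n";
echo 'pcre.recursion_limit = ', ini_get('pcre.recursion_limit'), "\n";
```

The count doubles with every extra character, so a fixed limit would be
crossed within a character or two of some threshold.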

If the subpattern is part of a longer string, it does work as anticipated.

Matching a literal period at the beginning of the pattern does not yield an
error.

Substituting a-zA-Z0-9 for the [:alnum:] character class does not affect
the results.

error_get_last() does not return anything, and nothing shows up in the logs
with error_reporting(-1) set either.
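
Failures inside the PCRE engine are reported through preg_last_error()
rather than error_get_last(), which may be why nothing is printed. The
sketch below shows one way to surface the error code; preg_error_name() is
a hypothetical helper, not a PHP function, and the regex and host are taken
from the test script in this report.

```php
<?php
// Hypothetical helper (not part of PHP): maps a preg_last_error() code to
// the name of the matching PREG_* constant.
function preg_error_name($code)
{
    $names = array(
        PREG_NO_ERROR              => 'PREG_NO_ERROR',
        PREG_INTERNAL_ERROR        => 'PREG_INTERNAL_ERROR',
        PREG_BACKTRACK_LIMIT_ERROR => 'PREG_BACKTRACK_LIMIT_ERROR',
        PREG_RECURSION_LIMIT_ERROR => 'PREG_RECURSION_LIMIT_ERROR',
        PREG_BAD_UTF8_ERROR        => 'PREG_BAD_UTF8_ERROR',
        PREG_BAD_UTF8_OFFSET_ERROR => 'PREG_BAD_UTF8_OFFSET_ERROR',
    );
    return isset($names[$code]) ? $names[$code] : "unknown ($code)";
}

$regex = '/^(?:[[:alnum:]](?:[[:alnum:]\-]*[[:alnum:]])*\.)*$/';
$host  = 'ABCDEFGHIJ1234567890';

$result = preg_match($regex, $host);
if ($result === false) {
    // preg_match() failed internally; ask PCRE why.
    echo 'preg_match failed: ', preg_error_name(preg_last_error()), "\n";
} else {
    echo ($result ? 'match' : 'no match'), "\n";
}
```

If the name printed is PREG_BACKTRACK_LIMIT_ERROR or
PREG_RECURSION_LIMIT_ERROR, the pcre.backtrack_limit and
pcre.recursion_limit ini settings would be the next thing to look at.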

Test script:
---------------
<?php
$regexs = array
(
        '/^[[:alnum:]](?:[[:alnum:]\-]*[[:alnum:]])*$/',
        '/^(?:[[:alnum:]](?:[[:alnum:]\-]*[[:alnum:]])*\.)*$/',
        '/^(?:[[:alnum:]](?:[[:alnum:]\-]*[[:alnum:]])*\.)+$/',
        '/^(?:[[:alnum:]](?:[[:alnum:]\-]*[[:alnum:]])*\.)*[[:alnum:]](?:[[:alnum:]\-]*[[:alnum:]])*$/'
);

$hosts = array
(
        'ABCDEFGHIJ1234567890.', // long string with period at end
        'ABCDEFGHI234567890.', // slightly shorter string with period at end
        '.ABCDEFGHIJ1234567890', // long string with period at start
        'ABCDEFGHIJ1234567890', // long string no period
        'ABCDEFGHI1234567890', // a little shorter
        'ABCDEFGHI123456789', // even shorter
        'ABCDEFGHIJ-1234567890', // long with hyphen
        'ABCDEFGHIJ-123456789', // shorter with hyphen
        'ABCDEFGHI-123456789', // even shorter with hyphen
        'WWW.ABCDEFGHIJ-1234567890.COM', // an FQDN with long string and hyphen
        // a really long FQDN
        'WWW.SUB-SUBDOMAIN.SUBDOMAIN.ABCD-EFGH-IJKL-MNOP-QRST-UVWX-YZ-12345-67890-abcd-efgh-hijk.COM'
);

foreach ($regexs as $regex)
{
        echo "\nRegex: $regex\n";

        foreach ($hosts as $host)
        {
                echo "  Host: $host\n";

                $result = preg_match($regex, $host);

                echo '    Result: ';
                if ($result === false)
                {
                        echo '(error) ';
                        print_r(error_get_last()); // never prints anything?
                }
                else
                {
                        echo ($result) ? '(match) ' : '(no match) ';
                }

                var_dump($result);
        }
}

Expected result:
----------------
None of the results should be bool(false).

Actual result:
--------------
// Output from the last regex only; the other regexes also yield bool(false)
Regex: /^(?:[[:alnum:]](?:[[:alnum:]\-]*[[:alnum:]])*\.)*[[:alnum:]](?:[[:alnum:]\-]*[[:alnum:]])*$/
  Host: ABCDEFGHIJ1234567890.
    Result: (error) bool(false)
  Host: ABCDEFGHI234567890.
    Result: (error) bool(false)
  Host: .ABCDEFGHIJ1234567890
    Result: (no match) int(0)
  Host: ABCDEFGHIJ1234567890
    Result: (error) bool(false)
  Host: ABCDEFGHI1234567890
    Result: (error) bool(false)
  Host: ABCDEFGHI123456789
    Result: (match) int(1)
  Host: ABCDEFGHIJ-1234567890
    Result: (error) bool(false)
  Host: ABCDEFGHIJ-123456789
    Result: (error) bool(false)
  Host: ABCDEFGHI-123456789
    Result: (match) int(1)
  Host: WWW.ABCDEFGHIJ-1234567890.COM
    Result: (match) int(1)
  Host: WWW.SUB-SUBDOMAIN.SUBDOMAIN.ABCD-EFGH-IJKL-MNOP-QRST-UVWX-YZ-12345-67890-abcd-efgh-hijk.COM
    Result: (match) int(1)

-- 
Edit bug report at https://bugs.php.net/bug.php?id=61018&edit=1