php-general Digest 9 Jan 2011 20:06:13 -0000 Issue 7124

Topics (messages 310615 through 310624):

Re: Validate Domain Name by Regular Express
        310615 by: Ashley Sheridan
        310616 by: Per Jessen
        310617 by: tedd
        310618 by: Daniel Brown
        310619 by: Ashley Sheridan
        310620 by: Daniel Brown
        310621 by: Donovan Brooke
        310622 by: Ashley Sheridan

curl & rtmp
        310623 by: Tontonq Tontonq
        310624 by: David Hutto

Administrivia:

To subscribe to the digest, e-mail:
        [email protected]

To unsubscribe from the digest, e-mail:
        [email protected]

To post to the list, e-mail:
        [email protected]


----------------------------------------------------------------------
--- Begin Message ---
On Sun, 2011-01-09 at 11:44 +0800, WalkinRaven wrote:

> Right, RFC 1034 allows any number of dot-separated labels, as long as the 
> total length does not exceed 255.
> 
> On 01/09/2011 01:21 AM, TR Shaw wrote:
> > On Jan 8, 2011, at 12:09 PM, Ashley Sheridan wrote:
> >
> >> On Sat, 2011-01-08 at 16:55 +0800, WalkinRaven wrote:
> >>
> >>> PHP 5.3 PCRE
> >>>
> >>> Regular expression to match the domain name format according to RFC 1034 -
> >>> DOMAIN NAMES - CONCEPTS AND FACILITIES
> >>>
> >>> /^
> >>> (
> >>>    [a-z]                 |
> >>>    [a-z] (?:[a-z]|[0-9]) |
> >>>    [a-z] (?:[a-z]|[0-9]|\-){1,61} (?:[a-z]|[0-9])                 ) # One label
> >>>
> >>> (?:\.(?1))*+        # More labels
> >>> \.?                 # Root domain name
> >>> $/iDx
> >>>
> >>> This rule matches only <label> and <label>. but not <label>.<label>...
> >>>
> >>> I don't know what is wrong with it.
> >>>
> >>> Thank you.
> >>>
> >>
> >>
> >> I think trying to do all of this in one regex will prove more trouble
> >> than it's worth. Maybe breaking it down into something like this:
> >>
> >> <?php
> >> $domain = "www.ashleysheridan.co.uk";
> >> $valid = false;
> >>
> >> $tlds = array('aero', 'asia', 'biz', 'cat', 'com', 'coop', 'edu', 'gov',
> >> 'info', 'int', 'jobs', 'mil', 'mobi', 'museum', 'name', 'net', 'org',
> >> 'pro', 'tel', 'travel', 'xxx', 'ac', 'ad', 'ae', 'af', 'ag', 'ai', 'al',
> >> 'am', 'an', 'ao', 'aq', 'ar', 'as', 'at', 'au', 'aw', 'ax', 'az', 'ba',
> >> 'bb', 'bd', 'be', 'bf', 'bg', 'bh', 'bi', 'bj', 'bm', 'bn', 'bo', 'br',
> >> 'bs', 'bt', 'bv', 'bw', 'by', 'bz', 'ca', 'cc', 'cd', 'cf', 'cg', 'ch',
> >> 'ci', 'ck', 'cl', 'cm', 'cn', 'co', 'cr', 'cu', 'cv', 'cx', 'cy', 'cz',
> >> 'de', 'dj', 'dk', 'dm', 'do', 'dz', 'ec', 'ee', 'eg', 'er', 'es', 'et',
> >> 'eu', 'fi', 'fj', 'fk', 'fm', 'fo', 'fr', 'ga', 'gb', 'gd', 'ge', 'gf',
> >> 'gg', 'gh', 'gi', 'gl', 'gm', 'gn', 'gp', 'gq', 'gr', 'gs', 'gt', 'gu',
> >> 'gw', 'gy', 'hk', 'hm', 'hn', 'hr', 'ht', 'hu', 'id', 'ie', 'il', 'im',
> >> 'in', 'io', 'iq', 'ir', 'is', 'it', 'je', 'jm', 'jo', 'jp', 'ke', 'kg',
> >> 'kh', 'ki', 'km', 'kn', 'kp', 'kr', 'kw', 'ky', 'kz', 'la', 'lb', 'lc',
> >> 'li', 'lk', 'lr', 'ls', 'lt', 'lu', 'lv', 'ly', 'ma', 'mc', 'md', 'me',
> >> 'mg', 'mh', 'mk', 'ml', 'mm', 'mn', 'mo', 'mp', 'mq', 'mr', 'ms', 'mt',
> >> 'mu', 'mv', 'mw', 'mx', 'my', 'mz', 'na', 'nc', 'ne', 'nf', 'ng', 'ni',
> >> 'nl', 'no', 'np', 'nr', 'nu', 'nz', 'om', 'pa', 'pe', 'pf', 'pg', 'ph',
> >> 'pk', 'pl', 'pm', 'pn', 'pr', 'ps', 'pt', 'pw', 'py', 'qa', 're', 'ro',
> >> 'rs', 'ru', 'rw', 'sa', 'sb', 'sc', 'sd', 'se', 'sg', 'sh', 'si', 'sj',
> >> 'sk', 'sl', 'sm', 'sn', 'so', 'sr', 'st', 'su', 'sv', 'sy', 'sz', 'tc',
> >> 'td', 'tf', 'tg', 'th', 'tj', 'tk', 'tl', 'tm', 'tn', 'to', 'tp', 'tr',
> >> 'tt', 'tv', 'tw', 'tz', 'ua', 'ug', 'uk', 'us', 'uy', 'uz', 'va', 'vc',
> >> 've', 'vg', 'vi', 'vn', 'vu', 'wf', 'ws', 'ye', 'yt', 'za', 'zm',
> >> 'zw', );
> >>
> >>
> >> if(strlen($domain) <= 253)
> >> {
> >>    $labels = explode('.', $domain);
> >>    if(in_array($labels[count($labels)-1], $tlds))
> >>    {
> >>            for($i=0; $i<count($labels) -1; $i++)
> >>            {
> >>                    if(strlen($labels[$i]) > 63 ||
> >> !preg_match('/^[a-z0-9](?:[a-z0-9\-]*[a-z0-9])?$/', $labels[$i]) ||
> >> preg_match('/^[0-9]+$/', $labels[$i]))
> >>                    {
> >>                            $valid = false;
> >>                            break;  // no point continuing if one label is wrong
> >>                    }
> >>                    else
> >>                    {
> >>                            $valid = true;
> >>                    }
> >>            }
> >>    }
> >> }
> >>
> >> var_dump($valid);
> >>
> >>
> >> This matches the last label against the list of TLDs, and every other
> >> label against the standard a-z0-9 and hyphen rule given as the
> >> preferred characters for a label (the LDH rule), with the added check
> >> that the first and last characters of a label aren't hyphens (oddly
> >> enough it doesn't mention starting with a digit!)
> >>
> >> Also, each label is checked to ensure it doesn't run over 63 characters,
> >> and the whole thing isn't over 253 characters. Lastly, each label is
> >> checked to ensure it doesn't completely consist of digits.
> >>
> >> I've tested it only with my domain so far, but it should work fairly
> >> well. As I said before, I couldn't think of a way to do it all with one
> >> regex. It could probably be done, but would you really want to create a
> >> huge and difficult to read/understand expression just because it's
> >> possible?
> > Ash
> >
> > I doubt it's possible, since the ccTLDs have valid domain names with three 
> > or more dotted parts. You should see .us. And .uk doesn't follow the ccTLD 
> > rules for .tk, for example.
> >
> > Now, if the purpose is to write a regex for a host name then that's a 
> > different story.
> >
> > Tom
> 


Which is what my code does too, while also checking for label length.
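
For the single-regex question quoted above, one likely reason the pattern stops after the first character of a later label is the order of the alternatives: the one-character branch comes first, and the possessive `*+` (together with the way PCRE of that era treats the `(?1)` recursion) prevents the engine from going back to try the longer branches. A minimal sketch that sidesteps this by writing the label without alternation is shown here. The function name `isValidDomainName` is hypothetical, and the 253-character limit is taken from the code in this thread rather than from the original regex.

```php
<?php
// Sketch only: validates the RFC 1034 "preferred name syntax" in one regex.
// isValidDomainName is an illustrative name, not part of PHP.
function isValidDomainName($name)
{
    // Ignore one trailing root dot when measuring the length.
    $bare = (substr($name, -1) === '.') ? substr($name, 0, -1) : $name;
    if (strlen($bare) > 253) {
        return false;
    }

    $pattern = '/^
        ( [a-z] (?: [a-z0-9-]{0,61} [a-z0-9] )? )  # one label, 1 to 63 characters
        (?: \. (?1) )*                             # more labels, reusing group 1
        \.?                                        # optional root dot
    $/iDx';

    return preg_match($pattern, $name) === 1;
}

var_dump(isValidDomainName('www.ashleysheridan.co.uk'));
var_dump(isValidDomainName('bad-.example'));
```

Because the label is a single branch with a greedy middle part, the engine only has to find the longest run of letters, digits and hyphens that ends in a letter or digit, so no retry of an earlier alternative is needed. Like the original pattern, this requires every label to start with a letter.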

Thanks,
Ash
http://www.ashleysheridan.co.uk



--- End Message ---
--- Begin Message ---
Tamara Temple wrote:

> On Jan 8, 2011, at 2:22 PM, Al wrote:
> 
>>
>>
>> On 1/8/2011 3:55 AM, WalkinRaven wrote:
>>> PHP 5.3 PCRE
>>>
>>> Regular expression to match the domain name format according to
>>> RFC 1034 - DOMAIN NAMES - CONCEPTS AND FACILITIES
>>>
>>> /^
>>> (
>>> [a-z] |
>>> [a-z] (?:[a-z]|[0-9]) |
>>> [a-z] (?:[a-z]|[0-9]|\-){1,61} (?:[a-z]|[0-9]) ) # One label
>>>
>>> (?:\.(?1))*+ # More labels
>>> \.? # Root domain name
>>> $/iDx
>>>
>>> This rule matches only <label> and <label>. but not
>>> <label>.<label>...
>>>
>>> I don't know what is wrong with it.
>>>
>>> Thank you.
>>
>>
>>
>> Look at filter_var()
>>
>> Validates value as URL (according to
>> http://www.faqs.org/rfcs/rfc2396),
>>
> 
> 
> I'm wondering what mods to make for this now that unicode chars are
> allowed in domain names....

You're talking about IDNs ?  The actual domain name is still US-ASCII,
only when you decode punycode do you get UTF8 characters.


-- 
Per Jessen, Zürich (10.1°C)


--- End Message ---
--- Begin Message ---
At 12:15 PM +0100 1/9/11, Per Jessen wrote:
> Tamara Temple wrote:
>
> > I'm wondering what mods to make for this now that unicode chars are
> > allowed in domain names....
>
> You're talking about IDNs ?  The actual domain name is still US-ASCII,
> only when you decode punycode do you get UTF8 characters.
>
> Per Jessen, Zürich (10.1°C)


Unfortunately, you are correct.

It was never the intention of the IDNS WG for the end-user to see PUNYCODE, but rather that all IDNS be seen by the end-user as actual Unicode code points (Unicode characters). The only browser that currently supports this is Safari.

For example --

http://xn--19g.com

-- is square-root dot com. In all browsers except Safari, PUNYCODE is shown in the address bar, but in Safari it's shown as √.com

The IDNS works, but for fear of homographic attacks IE (and other browsers) will not show the IDNS correctly.
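
To make the mapping between a Unicode label and its ASCII form concrete, here is a rough sketch of the Punycode encoding step from RFC 3492 in plain PHP. The function names `utf8CodePoints` and `punycodeEncodeLabel` are made up for illustration, the input is assumed to be well-formed UTF-8, and none of the other IDNA processing (case mapping, normalisation, prohibited characters) is attempted.

```php
<?php
// Sketch only: decode well-formed UTF-8 into code points without mbstring.
function utf8CodePoints($s)
{
    $points = array();
    $len = strlen($s);
    for ($i = 0; $i < $len; ) {
        $b = ord($s[$i]);
        if ($b < 0x80)     { $cp = $b;        $extra = 0; }
        elseif ($b < 0xE0) { $cp = $b & 0x1F; $extra = 1; }
        elseif ($b < 0xF0) { $cp = $b & 0x0F; $extra = 2; }
        else               { $cp = $b & 0x07; $extra = 3; }
        for ($j = 1; $j <= $extra; $j++) {
            $cp = ($cp << 6) | (ord($s[$i + $j]) & 0x3F);
        }
        $points[] = $cp;
        $i += $extra + 1;
    }
    return $points;
}

// Sketch only: Punycode-encode one label and add the "xn--" prefix.
function punycodeEncodeLabel($label)
{
    $base = 36; $tMin = 1; $tMax = 26; $skew = 38; $damp = 700;
    $digits = 'abcdefghijklmnopqrstuvwxyz0123456789';
    $input = utf8CodePoints($label);
    $total = count($input);

    // Basic (ASCII) code points are copied through first.
    $output = '';
    foreach ($input as $cp) {
        if ($cp < 128) { $output .= chr($cp); }
    }
    $basicCount = strlen($output);
    if ($basicCount === $total) {
        return $label; // pure ASCII, nothing to encode
    }
    if ($basicCount > 0) { $output .= '-'; }

    $n = 128; $delta = 0; $bias = 72; $handled = $basicCount;
    while ($handled < $total) {
        // Find the smallest code point not yet handled.
        $m = PHP_INT_MAX;
        foreach ($input as $cp) {
            if ($cp >= $n && $cp < $m) { $m = $cp; }
        }
        $delta += ($m - $n) * ($handled + 1);
        $n = $m;
        foreach ($input as $cp) {
            if ($cp < $n) { $delta++; }
            if ($cp === $n) {
                // Emit delta as a variable-length base-36 integer.
                $q = $delta;
                for ($k = $base; ; $k += $base) {
                    $t = ($k <= $bias) ? $tMin : (($k >= $bias + $tMax) ? $tMax : $k - $bias);
                    if ($q < $t) { break; }
                    $output .= $digits[$t + ($q - $t) % ($base - $t)];
                    $q = (int) (($q - $t) / ($base - $t));
                }
                $output .= $digits[$q];
                // Bias adaptation from RFC 3492 section 6.1.
                $delta = ($handled === $basicCount) ? (int) ($delta / $damp) : (int) ($delta / 2);
                $delta += (int) ($delta / ($handled + 1));
                $k = 0;
                while ($delta > (($base - $tMin) * $tMax) / 2) {
                    $delta = (int) ($delta / ($base - $tMin));
                    $k += $base;
                }
                $bias = $k + (int) ((($base - $tMin + 1) * $delta) / ($delta + $skew));
                $delta = 0;
                $handled++;
            }
        }
        $delta++;
        $n++;
    }
    return 'xn--' . $output;
}

echo punycodeEncodeLabel("\xE2\x88\x9A"), ".com\n"; // U+221A SQUARE ROOT
```

Where the intl extension is available, PHP's own `idn_to_ascii()` is the more sensible choice; the hand-written version is only meant to show that the name on the wire stays plain ASCII.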

Cheers,

tedd

--
-------
http://sperling.com/

--- End Message ---
--- Begin Message ---
On Sun, Jan 9, 2011 at 11:58, tedd <[email protected]> wrote:
>
> For example --
>
> http://xn--19g.com
>
> -- is square-root dot com. In all browsers except Safari, PUNYCODE is shown
> in the address bar, but in Safari it's shown as ˆ.com

    Not sure if that's a typo or an issue in translation while the
email was being relayed through the tubes, but ˆ.com directs to
xn--wqa.com here.

-- 
</Daniel P. Brown>
Network Infrastructure Manager
Documentation, Webmaster Teams
http://www.php.net/

--- End Message ---
--- Begin Message ---
On Sun, 2011-01-09 at 12:23 -0500, Daniel Brown wrote:

> On Sun, Jan 9, 2011 at 11:58, tedd <[email protected]> wrote:
> >
> > For example --
> >
> > http://xn--19g.com
> >
> > -- is square-root dot com. In all browsers except Safari, PUNYCODE is shown
> > in the address bar, but in Safari it's shown as ˆ.com
> 
>     Not sure if that's a typo or an issue in translation while the
> email was being relayed through the tubes, but ˆ.com directs to
> xn--wqa.com here.
> 
> -- 
> </Daniel P. Brown>
> Network Infrastructure Manager
> Documentation, Webmaster Teams
> http://www.php.net/
> 


^ is to the power of, not square root, which is √, which does translate
to Tedds domain

Thanks,
Ash
http://www.ashleysheridan.co.uk



--- End Message ---
--- Begin Message ---
On Sun, Jan 9, 2011 at 12:32, Ashley Sheridan <[email protected]> wrote:
>
> ^ is to the power of, not square root, which is √, which does translate to 
> Tedds domain

    Thanks for the math lesson, professor, but I already knew that.  ;-P

    My point is, and as you can see in the quoted text from my email,
that I don't know if it was a typo on Tedd's part or what, but ^.com
is what came through here.

--
</Daniel P. Brown>
Network Infrastructure Manager
Documentation, Webmaster Teams
http://www.php.net/

--- End Message ---
--- Begin Message ---
Daniel Brown wrote:
> On Sun, Jan 9, 2011 at 11:58, tedd <[email protected]> wrote:
> >
> > For example --
> >
> > http://xn--19g.com
> >
> > -- is square-root dot com. In all browsers except Safari, PUNYCODE is shown
> > in the address bar, but in Safari it's shown as ˆ.com
>
>     Not sure if that's a typo or an issue in translation while the
> email was being relayed through the tubes, but ˆ.com directs to
> xn--wqa.com here.


error in translation.

I get the same domain for:
seamonkey
firefox
googlechrome
safari

but yes, the actual square root character appears in safari only.

Interesting!
Donovan




--
D Brooke

--- End Message ---
--- Begin Message ---
On Sun, 2011-01-09 at 12:38 -0500, Daniel Brown wrote:

> On Sun, Jan 9, 2011 at 12:32, Ashley Sheridan <[email protected]> 
> wrote:
> >
> > ^ is to the power of, not square root, which is √, which does translate to 
> > Tedds domain
> 
>     Thanks for the math lesson, professor, but I already knew that.  ;-P
> 
>     My point is, and as you can see in the quoted text from my email,
> that I don't know if it was a typo on Tedd's part or what, but ^.com
> is what came through here.
> 
> --
> </Daniel P. Brown>
> Network Infrastructure Manager
> Documentation, Webmaster Teams
> http://www.php.net/


Sorry, lol!

It came through as an unrecognised character for me, maybe some email
issue then?

Thanks,
Ash
http://www.ashleysheridan.co.uk



--- End Message ---
--- Begin Message ---
Does cURL support the RTMP protocol? If so, is there an example? Do we need
to enable a different library? Can we save an RTMP stream with cURL? If not,
is there any other RTMP downloader that you know of?

--- End Message ---
--- Begin Message ---
On Sun, Jan 9, 2011 at 2:58 PM, Tontonq Tontonq <[email protected]> wrote:
> does cUrl supports rtmp protocol? if so is there any example?

These are easy to answer by searching for the terms, which seem too
specific not to turn up an answer in the search engines.

> do we need
> enable different library? so if not can we save rtmp by curl? if not is
> there any other rtmp downloader that u know ?

You seem to know enough to have answered this by yourself, almost in
your own questions.
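
For what it's worth, libcurl can handle rtmp:// URLs only when it was built against librtmp, so the quickest check is to ask the installed build which protocols it reports. The sketch below does that with `curl_version()`; the helper name `curlSupportsProtocol` is made up for illustration, and whether a given PHP build reports RTMP depends entirely on how its libcurl was compiled.

```php
<?php
// Sketch only: curlSupportsProtocol is an illustrative helper, not a PHP function.
// Returns true or false, or null when the curl extension is not loaded.
function curlSupportsProtocol($protocol)
{
    if (!function_exists('curl_version')) {
        return null; // curl extension not available in this PHP build
    }
    $info = curl_version();
    return in_array(strtolower($protocol), $info['protocols'], true);
}

$rtmp = curlSupportsProtocol('rtmp');
if ($rtmp === null) {
    echo "The curl extension is not loaded.\n";
} elseif ($rtmp) {
    echo "This libcurl build reports RTMP support.\n";
} else {
    echo "This libcurl build does not report RTMP support.\n";
}
```

If RTMP is not listed, a dedicated tool such as rtmpdump (from the same project as librtmp) is the usual alternative.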

--- End Message ---
