On Aug 30, 2006, at 6:17 PM, Christopher Jett wrote:
I am needing to validate the contents of an EditField as a properly
formatted URL. Does someone have a good suggestion on how to
accomplish this? Perhaps a REGEX?
This is probably more than you want to hear, but here it goes...
There are basically two parts to a URL (sometimes three): the
protocol and the address. These parts are separated by a "://"
string, so the first task is to separate the protocol from the
address, which you can do like this:
Function IsValidURL(source As String) As Boolean
Since a valid URL *cannot* have any spaces, it is easy to test for
them and reject the URL if there are any. The Trim() function below
removes any whitespace in the front or at the end of the string, so
you can ignore any accidental invisible characters. We only really
want to reject the string if there is a space within the URL.
Dim tsource As String = Trim(source)
If (tsource.InStr(" ") > 0) Then Return False
Dim s() As String = Split(tsource, "://")
The great thing about the Split command is that it will return the
entire string in slot s(0) if the substring is not found. Therefore
we can test whether the source string had "://" by checking to see if
UBound(s) > 0. However, a valid URL must contain exactly one "://"
sequence, so you should reject the string if it does not split into
exactly two parts.
If UBound(s) <> 1 Then Return False
You *could* assume that the user forgot to put in "http://" and add
it in for them. That is your call, but for the purposes of this
function a string without "://" is not a valid URL.
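If you do decide to be forgiving, one way to sketch it is a small
separate helper that runs before validation. The name
AddDefaultProtocol is made up for this example and is not part of
IsValidURL; it simply assumes "http" whenever no "://" is present.

```realbasic
Function AddDefaultProtocol(source As String) As String
  // Hypothetical helper, kept separate from IsValidURL.
  // Assumption: a string with no "://" was meant to be an http URL.
  Dim tsource As String = Trim(source)
  If tsource = "" Then Return tsource
  If InStr(tsource, "://") = 0 Then Return "http://" + tsource
  Return tsource
End Function
```

You would then call IsValidURL(AddDefaultProtocol(source)) instead of
IsValidURL(source).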
There is an optional third part to a URL: the username and password.
It is separated from the address by the "@" symbol, and anything that
comes before the "@" is *not* part of the website address. This used
to be a common trick among spammers to make you think that you were
going to a different website. For example, this URL actually points
to your local computer:
http://www.realsoftware.com/download/[EMAIL PROTECTED]/
So you should really test for the "@" symbol as well, using the same
approach as above:
Dim a() As String = Split(s(1), "@")
We don't care about the user name and password, so if a() has more
than one element, we can just copy the second element back into
s(1).
If UBound(a) > 0 Then s(1) = a(1)
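One thing to be aware of: splitting the whole of s(1) on "@" also
reacts to an "@" that appears later in the path, and it drops
anything after a second "@". If that matters to you, a possible
refinement is to look for "@" only in the part before the first "/".
The helper below is a sketch with a made-up name (StripUserInfo), not
part of the function being built here.

```realbasic
Function StripUserInfo(address As String) As String
  // Hypothetical helper: removes "user:password@" from the host part only.
  Dim slashPos As Integer = InStr(address, "/")
  Dim host As String = address
  Dim rest As String = ""
  If slashPos > 0 Then
    host = Left(address, slashPos - 1)
    rest = Mid(address, slashPos)
  End If
  // keep only what follows the last "@" in the host part
  Dim atPos As Integer = InStr(host, "@")
  While atPos > 0
    host = Mid(host, atPos + 1)
    atPos = InStr(host, "@")
  Wend
  Return host + rest
End Function
```

With this in place you could write s(1) = StripUserInfo(s(1)) instead
of the two lines above.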
Finally, you are left with the address and file request portion of
the URL. In order for a URL address to be valid, it has to be in the
format of either an IP address (192.168.1.1) or a domain name
(www.website.com). Either form contains a period, so you could just
test for that. But to make sure the period is part of the address,
you need to check that it comes before the first "/" character (if
present). The code below returns False only in a case like
"absdfajl/test.html", where the first period comes after the first
"/"; an address with no period at all still passes.
If (s(1).InStr("/") > 0) And (s(1).InStr(".") > s(1).InStr("/")) Then Return False
// if you got this far then the URL is good
Return True
End Function
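If you would rather insist that a period really does appear in the
address part, a stricter check could look something like this. It is
only a sketch; HostHasPeriod is a name invented for the example, and
note that it would also reject single-word hosts such as "localhost".

```realbasic
Function HostHasPeriod(address As String) As Boolean
  // Hypothetical helper: True only if a "." occurs before the first "/".
  Dim slashPos As Integer = InStr(address, "/")
  Dim host As String = address
  If slashPos > 0 Then host = Left(address, slashPos - 1)
  Return InStr(host, ".") > 0
End Function
```

Inside IsValidURL it would replace the last test:
If Not HostHasPeriod(s(1)) Then Return False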
This is just about all that you need to verify that the address of
the URL is properly formatted. There is no need to check for ".com"
or other top-level domains because a URL does not have to point to a
website. For example, the Apple Filing Protocol (AFP) lets you
connect to other computers via a URL written like
"afp://myG4.local/". There are also addresses for network printers
which use URLs, and anyone can write a custom protocol and tag.
Which leads to the final point...
The above code can be used to determine whether a string is a
well-formed URL, but the protocol has not been tested. There are
http, https, ftp, ftps, and feed, to name just a few of the more
common ones. If you care about which protocols you are willing to
support, then you can add additional tests for this. For example,
you could add a "supported protocols" parameter listing the protocols
you wish to accept. I would pass a single string with the items
separated by commas or semicolons: "http;https;feed".
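A rough sketch of that protocol test, written as a separate function
so it can be combined with IsValidURL however you like. The function
name and the "supported" parameter are made up for this example, and
it assumes semicolons as the separator.

```realbasic
Function IsSupportedProtocol(url As String, supported As String) As Boolean
  // Hypothetical helper; "supported" looks like "http;https;feed".
  Dim s() As String = Split(Trim(url), "://")
  If UBound(s) <> 1 Then Return False
  If s(0) = "" Then Return False
  Dim allowed() As String = Split(supported, ";")
  Dim i As Integer
  For i = 0 To UBound(allowed)
    // string comparison with "=" ignores case, so "HTTP" matches "http"
    If Trim(allowed(i)) = s(0) Then Return True
  Next
  Return False
End Function
```

A caller could then accept a string only when both
IsValidURL(source) and IsSupportedProtocol(source, "http;https;feed")
return True.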
The complete function, without the running commentary:
Function IsValidURL(source As String) As Boolean
// Get rid of accidental whitespace
Dim tsource As String = Trim(source)
// reject if any space characters remain
If (tsource.InStr(" ") > 0) Then Return False
// separate the two main parts
Dim s() As String = Split(tsource, "://")
// if not exactly two parts, then reject as valid URL
If UBound(s) <> 1 Then Return False
// ignore the username:password format separated by the "@" symbol
Dim a() As String = Split(s(1), "@")
If UBound(a) > 0 Then s(1) = a(1)
// make certain there is at least one period symbol before the first "/"
If (s(1).InStr("/") > 0) And (s(1).InStr(".") > s(1).InStr("/")) Then Return False
// if you got this far then the URL is good
Return True
End Function
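To tie this back to the original question, here is one way the
function might be called for an EditField. The control name
EditField1 and the wording of the message are assumptions for the
example; the code could live in a button's Action event or wherever
you do your validation.

```realbasic
// EditField1 is an assumed control name
If Not IsValidURL(EditField1.Text) Then
  MsgBox "Please enter a complete URL, such as http://www.example.com/"
End If
```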